How do you build the world's fastest supercomputer?

AUBREY LOVELL
Hello and welcome back to Technology Now, a weekly show from Hewlett Packard Enterprise where we take what's happening in the world and explore how it's changing the way organizations are using technology.

We’re your hosts Aubrey Lovell,

MICHAEL BIRD
and Michael Bird, and this week we are answering a very simple question: how do you build the world’s fastest supercomputer?

- We’ll find out what makes a supercomputer… a supercomputer

- We’ll be exploring the different building blocks of a supercomputer

- And we’ll be asking, is there a limit to how big these machines can get?

AUBREY LOVELL
That is correct, so if you’re the kind of person who needs to know *why* what’s going on in the world matters to your organisation, this podcast is for you.

And if you haven’t yet, subscribe on your podcast app of choice so you don’t miss out.

Alright Michael, let’s get into it!

MICHAEL BIRD
Yeah, let’s do it!

MICHAEL BIRD
I think we can all agree that computers are one of the marvels of modern technology –

AUBREY LOVELL
Absolutely can't imagine a life without them.

MICHAEL BIRD
Yeah, I mean we carry them in our pockets. I'm pretty sure they're embedded into pretty much everything we interact with.

AUBREY LOVELL
They're even in fridges now.

MICHAEL BIRD
I've seen a fridge that has cameras in it and it can tell you when stuff is gone. Anyway, the amount of stuff we can do on them is really quite remarkable. But what happens when your standard laptop or even your company's server room isn't enough? Well, you need to scale it up.

AUBREY LOVELL
Enter the supercomputer.

These titans of engineering perform tasks, models, and simulations of some of the most complicated processes in our universe covering everything from weather, to atomic physics, and even discovering new drugs.

Currently the world’s fastest supercomputer, El Capitan, is housed at Lawrence Livermore National Laboratory where it was created in collaboration with the U.S. Department of Energy and the National Nuclear Security Administration to ensure the safety and security of the United States’ nuclear arsenal (among other things).

MICHAEL BIRD
Now the world’s most powerful supercomputer is, obviously, a huge topic, so to give it the respect it deserves, we’ve split this topic across two episodes.

This week, we are going to look at the process of building the world’s largest supercomputer, and next week we are going to explore what you actually do with it!

AUBREY LOVELL
That’s right, Michael.

And, to kick us off on this topic, we are thrilled to be joined by Bronis de Supinski, the CTO of Livermore Computing at Lawrence Livermore National Laboratory.

MICHAEL BIRD
Bronis, welcome to the show. So first question, in one minute or less, what makes something a supercomputer?

BRONIS DE SUPINSKI
Well, a supercomputer is really just a computer that has more capability than the vast majority of other computers available at that time. So what's a supercomputer today, 10 years from now might be sitting in your pocket. That's happened quite a few times with computers that I've worked on during my career… now they sit in my pocket.

MICHAEL BIRD
Wait, so you've worked on what at the time you called a supercomputer, but nowadays is just as powerful as a smartphone that I have in my pocket.

BRONIS DE SUPINSKI
That's right. That's the way technology has gone.

MICHAEL BIRD
So by the definition of a supercomputer today, how does it differ from the laptop sat on my desk or a server racked in a server room?

BRONIS DE SUPINSKI
It depends on the supercomputer, so sometimes the technology can be very different, but oftentimes it's just very similar technology but a lot more of it. The individual nodes of a supercomputer are very similar to the computers in your laptop or your desktop system. But there’s a lot of them.

MICHAEL BIRD
Yeah, okay, okay. Well, I’m gonna come onto that later, because I really want to ask you about the details of that. But can you sort of explain how you measure the size of a supercomputer?

BRONIS DE SUPINSKI
The most frequent measurement that we use to rank supercomputers is the performance on a benchmark called LINPACK, or High-Performance LINPACK (HPL). And that basically is solving a system of linear equations: how large of a system can you solve, and in how much time? And so that then translates to a number of floating point operations per second that can be achieved.
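
As a rough illustration of the idea Bronis describes, the sketch below solves a small dense system of linear equations and converts the elapsed time into an approximate floating point rate. It is not the HPL benchmark itself: the function names `solve_dense` and `measure_flops` are made up for this example, the solver is plain Gaussian elimination in pure Python, and the operation count uses only the leading-order (2/3)n³ term commonly quoted for this kind of solve.

```python
import random
import time


def solve_dense(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on copies so the caller's data is left untouched.
    a = [row[:] for row in a]
    b = b[:]
    for col in range(n):
        # Use the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        total = sum(a[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (b[row] - total) / a[row][row]
    return x


def measure_flops(n, seed=0):
    """Time one n-by-n solve and return (solution, approximate flops)."""
    rng = random.Random(seed)
    a = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    start = time.perf_counter()
    x = solve_dense(a, b)
    elapsed = time.perf_counter() - start
    approx_ops = (2.0 / 3.0) * n ** 3  # leading-order operation count only
    return x, approx_ops / elapsed


if __name__ == "__main__":
    _, rate = measure_flops(150)
    print(f"roughly {rate:.3e} floating point operations per second")
```

A laptop running this interpreted code will report a rate many orders of magnitude below an exaflop, which is part of the point: the benchmark number reflects both the hardware and how well the solver uses it.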

MICHAEL BIRD
And, the El Capitan, what’s the flops number?

BRONIS DE SUPINSKI
What we report as peak, you know, so the guaranteed-not-to-exceed performance, is two point seven four exaflops. So that's 10 to the 18th flops. What it got on LINPACK was one point seven nine exaflops. So it actually was doing a somewhat useful computation at a rate of almost two exaflops.

You'd need more than a million of your smartphones to be able to do that computation and to perform computation at that same rate. The thing is that they also, because it's solving a single problem, have to be able to communicate and work on that problem together. So you wouldn't be able to get your, you know, a million smartphones to compute at that rate because they don't have the ability to communicate quickly enough.
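
The two figures quoted here can be turned into a couple of quick back-of-the-envelope numbers. The sketch below is only arithmetic on the 2.74 and 1.79 exaflop values from the conversation; the "one million devices" split is a hypothetical that assumes perfect scaling with no communication cost, which is exactly what Bronis says you would not get in practice.

```python
EXA = 10 ** 18

PEAK_EXAFLOPS = 2.74  # peak figure quoted in the conversation
HPL_EXAFLOPS = 1.79   # LINPACK result quoted in the conversation

# Fraction of the theoretical peak that the benchmark run achieved.
efficiency = HPL_EXAFLOPS / PEAK_EXAFLOPS

# Hypothetical: what each of one million devices would have to sustain
# to match the LINPACK rate, ignoring all communication overhead.
DEVICES = 1_000_000
per_device_flops = HPL_EXAFLOPS * EXA / DEVICES

print(f"fraction of peak achieved on LINPACK: {efficiency:.1%}")
print(f"rate needed per device: {per_device_flops / 1e12:.2f} teraflops")
```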

MICHAEL BIRD
So does that mean that El Capitan is an exascale computer? And if that's the case, what does an exascale computer mean?

BRONIS DE SUPINSKI
So El Capitan is certainly an exascale computer because by pretty much all of the metrics one might talk about in terms of scale, 10 to the 18th kind of dominates throughout. So for us, because we're focused primarily on modelling and simulation, we're very concerned with being able to do double precision floating point operations. So that's 64-bit floating point operations. Our application teams need that level of precision to get accurate answers. And so when I said the 2.74 peak, that's double precision floating point operations.
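
To give a feel for why the 64-bit requirement matters, here is a minimal sketch comparing the same value stored in double precision and in 32-bit single precision. Python floats are 64-bit; the helper `to_single_precision` is a name invented for this example and simply round-trips a value through a 32-bit encoding using the standard `struct` module.

```python
import struct


def to_single_precision(value):
    """Round a 64-bit Python float to the nearest 32-bit float and back."""
    return struct.unpack("f", struct.pack("f", value))[0]


x64 = 0.1
x32 = to_single_precision(x64)

print(f"64-bit: {x64:.20f}")
print(f"32-bit: {x32:.20f}")
print(f"difference: {abs(x64 - x32):.3e}")
```

A single rounding error of this size is tiny, but a simulation performs enormous numbers of operations that build on one another, so the extra digits carried by 64-bit arithmetic help keep accumulated error under control.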

MICHAEL BIRD
So I imagine you can't just pick up the phone or hop online and just quickly order yourself an El Capitan. I suspect it's quite a complicated process. So I wonder if I could just ask you about how do you go about building a supercomputer? So like, what's the first thing you need to do when you're setting out to build the world's fastest supercomputer?

BRONIS DE SUPINSKI
If you are looking at doing something in a cost effective way, really managing your money well, then you need to... work with the technology providers, the system integrators, the processor manufacturers, understand what they can do and work with them closely to figure out several years in advance how we can push the envelope. Because it really is about building something beyond what anyone else is able to build at that time. And so that takes time, it takes investment, also takes a lot of close work looking at here's the problems we want to solve and here's the ways that we could evolve technology to solve them better. So the US Exascale Program actually took place over more than a decade really.

MICHAEL BIRD
So in terms of the sort of practical environment for where the supercomputer lives, like, what are the facility needs? Like, you know, is it a big air-conditioned building? Does it need to be close to a power station?

BRONIS DE SUPINSKI
You know, I think our site is kind of a good example of what you need nonetheless. Right. So first of all, just across the road from Lawrence Livermore, literally, we have a very large substation from one of the two main power providers in California. And also not very far away is another substation for the other big electricity provider. So our site happens to be very well located in that we have access to very large amounts of power. And the amount of power you need to run a supercomputer is large and it keeps getting larger. But then you need a whole bunch of infrastructure to bring that power from the distribution point at your site actually into the building where you're going to have the system.

Prior to El Capitan, our centre had the capability to deliver 45 megawatts to our main computer floor. We actually run our centre so that we are able to have the current system still running while we’re fielding and deploying the new system. So that means we actually want to be able to run two very large systems at the same time.

In addition, you need to be able to cool the computer. So when you’re using a lot of electricity, moving all those electrons generates a lot of heat. All right. So our building was originally designed to house air cooled systems, but we also had the ability to do water cooled. So what sits just outside of our building is the ability to take all of that electricity and bring it into the building. We have six very large cooling towers that we added for the extra cooling capability. Then you have to have all the infrastructure to bring it into the building and then distribute it to the actual computer, which occupies a fairly large amount of floor space in and of itself.

MICHAEL BIRD
And so if El Capitan is water-cooled, does that mean it's… from a power and energy perspective, is it more efficient?

BRONIS DE SUPINSKI
Yeah, so it's actually a lot more efficient because that means that we can run the clocks at higher speeds and much more efficiently remove that heat, right? So the cooling is really a key aspect to energy efficiency.

AUBREY LOVELL
Thanks so much Bronis. It’s always nice to see previous Technology Now topics like liquid cooling coming up so we can see them in practice.

I also just want to quickly define something which we just heard from Michael and Bronis. We measure the speed of a supercomputer in FLOPS (which is floating point operations per second, for the uninitiated) but we didn’t ask what a floating-point operation is! Luckily the answer is relatively simple: floating point operations are a type of math that use floating point numbers which can be thought of as just a different way to write the numbers that we use.
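
One way to picture that "different way to write the numbers" is as a fraction multiplied by a power of two. The short sketch below uses Python's standard `math.frexp` to split a number into those two parts and `math.ldexp` to put it back together; the wrapper name `decompose` is just for this example.

```python
import math


def decompose(value):
    """Split a float into (fraction, exponent) with value == fraction * 2**exponent."""
    return math.frexp(value)


fraction, exponent = decompose(6.0)
print(f"6.0 is stored as {fraction} x 2^{exponent}")

# Recombining the two parts gives back the original number.
assert math.ldexp(fraction, exponent) == 6.0
```

Each addition or multiplication carried out on numbers in this form counts as one floating point operation.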

Anyway, looking forward to hearing more about how we build a supercomputer later in the show.

MICHAEL BIRD
OK, so… now it’s time for “Today I learned”, the part of the show where we take a look at something happening in the world that we think you should know about. Aubrey, what have you got for us this week…

AUBREY LOVELL
So I assume you’ve seen films where people use infrared cameras, right Michael?

MICHAEL BIRD
Yeah, to sort of see through things.

AUBREY LOVELL
Like a lot of like military, high impact movies, you know, that you see them using this type of technology, right?

MICHAEL BIRD
Yeah, yeah, or I've seen those sort of, I think you'd call them a cop show, where they're following somebody in a helicopter at night and they're sort of hiding in the woods and they use infrared cameras to find them.

AUBREY LOVELL
Well, in a paper published in the journal Cell, researchers in China have announced the creation of contact lenses which respond to infra-red light, turning it into visible light, and allowing the wearer to see something which would have previously been invisible.

That's incredible and I also wear contact lenses, so this is very exciting for me, but I also am like the amount of power you could have to like be able to see in the dark. I'm already blind, so like to be able to have that in the middle of the night would be crazy. Unlike current night vision goggles, which are bulky and kind of require power, the new contact lenses are easily wearable and do not require any external power source. However, these changes do come with a couple of setbacks. I knew that there was a catch here.

The lenses do not provide any sort of detailed vision yet as they only respond to high intensity infra-red light; however, the researchers note that even this could be useful for sending discreet messages in security, search and rescue, and military scenarios. Very interesting.

Obviously, like any wearable tech, there are a few safety concerns. The contact lenses work by using nanoparticles to convert the infra-red into visible light and there are worries about these leaking into the eyes; however, they still mark a step forward towards what could be an exciting new technology.

MICHAEL BIRD
That is exciting. That is exciting. I've sort of been following some photographers who have been taking off filters from their cameras so they can shoot in infrared. And everything looks a bit sort of interesting looking. I imagine that'd be quite cool. The world would look very different with everything in infrared. Do you think you could turn the TV on and off just by sort of blinking? That'd be quite fun, wouldn't it?

AUBREY LOVELL
That would be next level. But I will say that last paragraph kind of got me like, I might want to be late adopter to this because I don't want any type of infrared leaking into my eyes. So no, thank you.

MICHAEL BIRD
The nanoparticles are maybe what I'm slightly worried about leaking into my eyes, but anyway.

Right then. Now it’s time to return to our guest, Bronis, to find out exactly what sort of kit we need to build a supercomputer like El Capitan.

So what sort of hardware do you need for a supercomputer? I mean, is it the same sort of traditional hardware that you would have, you know, as a standard server sat in a rack, storage, networking, compute, memory, or are there some additional components that make El Capitan special?

BRONIS DE SUPINSKI
Well, there are some components that make it special. But, at the very most basic level, there are processors and there are solid state drives, there's hard disks. The file system actually is still air cooled, so there's fans on that. Something that you don't have in your laptop or your desktop is we have cooling distribution units that move the liquid through the cabinets. So that's a little bit unusual. The processors in El Capitan are somewhat special. In most supercomputers, we just have faster CPUs and GPUs than you have.

El Capitan has something that's called an APU, which is an Accelerated Processing Unit. And that's a single device that actually brings together CPU cores and GPU cores. The MI300A is the first server class APU. So that's pretty special. The device we have that houses our SSDs is a bit special in that it allows the SSDs to be directly mounted by multiple compute nodes, which is kind of unusual. But, you know, in the end, it's a large amount of processors and storage devices and they're not that different at some level. They're higher speed, higher density, higher quality.

MICHAEL BIRD
I mean how many processors, GPUs, SSD, like how many are we talking? Hundreds? Thousands?

BRONIS DE SUPINSKI
So supercomputers today typically have multiple of what we would call compute nodes. So El Capitan has a little over 11,000, so 11,136, compute nodes. All right, and each of those compute nodes has four APUs. So there's 44,544 APUs in El Capitan. So that's a fairly large number of processors in one system.
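
The node and APU counts multiply out as stated, and combining them with the 2.74 exaflop peak mentioned earlier gives a rough sense of scale per device. The sketch below is only that arithmetic; the per-APU figure assumes the whole peak is spread evenly across the APUs, which is an assumption made for illustration and not a published specification.

```python
COMPUTE_NODES = 11_136   # compute nodes, as quoted
APUS_PER_NODE = 4        # APUs in each compute node, as quoted
PEAK_FLOPS = 2.74e18     # peak double precision rate quoted earlier


def total_apus(nodes, apus_per_node):
    """Total number of APUs across all compute nodes."""
    return nodes * apus_per_node


def peak_per_apu(peak_flops, apus):
    """Peak rate per APU, assuming the peak is shared evenly (illustrative)."""
    return peak_flops / apus


apus = total_apus(COMPUTE_NODES, APUS_PER_NODE)
print(f"APUs in the system: {apus:,}")
print(f"implied peak per APU: {peak_per_apu(PEAK_FLOPS, apus) / 1e12:.1f} teraflops")
```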

MICHAEL BIRD
So does a supercomputer just sort of, run off the shelf software? You know, does it run like a normal desktop operating system or is it something a bit more specialized?

BRONIS DE SUPINSKI
El Capitan uses what we call the TriLab operating system software or TOSS, but that's just built on a standard Linux distribution. Now we do some things where we… We remove a lot of things that you don't need. So it's running many fewer daemons. The compilers that we use are built largely on open-source compiler infrastructures. There are proprietary ones also, but that's what we use to generate the machine code for the applications that we run.

The applications that we run are typically, you know, homegrown applications. You know, there are ISVs that have applications that run also, but those are not so much ordinary applications, typically, because they have to be programmed to be able to run across a large number of processors.

MICHAEL BIRD
So the sort of software and hardware come together. Presumably there's a period of testing it, making sure everything is, you know, connected correctly and is talking to each other correctly. Like how long does that take?

BRONIS DE SUPINSKI
With El Capitan, there were some things that led us to actually deploying all the cabinets and all the networking infrastructure. And then, once we started installing the compute nodes, that took about two and a half, three months, I think, because, you know, it takes a long time to actually, like, bring them in, slot them, bring them up, make sure that they're running correctly, that nothing got damaged in shipping. And then it takes a bit more time to really make sure everything is running correctly and running well. Honestly, people will be a bit aghast, but for a really, really large system, to get it up and running takes almost a year.

MICHAEL BIRD
So what would need to be done to be able to build a computer even bigger than El Capitan? I mean is it as simple as just, you know, if it was 11,000 nodes you just buy 22,000 nodes or actually are there some technical or even physics-based challenges that mean that it's not as simple as doing that?

BRONIS DE SUPINSKI
You certainly eventually reach a point where it's not connected in the same way, right? Where the networking connections become kind of untenable. Doubling the size would primarily be a matter of money. I mean, there are also just, like, supply chain issues: getting all of that stuff manufactured is also very complicated, very time consuming. At ten times as much, there are more challenges involved, and being able to wire it together in the same tightly connected way becomes very difficult. You know, all of the nodes in El Capitan are able to communicate with each other at fairly high bandwidth and low latency, you know, so fairly quickly and at high data volumes. All right. At some point, it becomes difficult to build a network that does that.
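
A simple way to see why wiring gets harder faster than the machine grows is to count the links needed if every node had its own direct connection to every other node. That all-to-all layout is an assumption chosen purely for illustration: nothing in the conversation says El Capitan is wired this way, and large systems generally use switched networks precisely to avoid it. The function name `direct_links` is invented for this sketch.

```python
def direct_links(nodes):
    """Number of links if every pair of nodes had a dedicated connection."""
    return nodes * (nodes - 1) // 2


current = 11_136  # compute node count quoted in the conversation

for factor in (1, 2, 10):
    n = current * factor
    print(f"{n:>7,} nodes -> {direct_links(n):>14,} pairwise links")
```

Doubling the node count roughly quadruples the number of pairs, and ten times the nodes means roughly a hundred times the pairs, which is the shape of the problem any real network design has to work around.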

AUBREY LOVELL
Bronis, this has been utterly fascinating. Thank you so much for your time. And with the speed of progress in the tech world, I think it’s particularly interesting to consider how long it takes to build these machines as I imagine at some point we will start approaching a maximum speed for supercomputers so I guess the (pretty exciting) questions start to arise… where do we go from here? What’s next?

MICHAEL BIRD
Right then, we are getting towards the end of the show which means it’s time for This Week in History. Aubrey, remind me of last week’s clue?

AUBREY LOVELL
So the clue was: it’s 1943, and this patent has just been granted for something that almost every single one of us uses daily…

MICHAEL BIRD
Yeah, and we were talking about whether it was an appliance or something. I think I said refrigerator, I think you said microwave.

AUBREY LOVELL
Yes and… in this case, we were going in completely the wrong direction because we should have been thinking much, much smaller… and I don't think that helps me at all. I have no idea.

MICHAEL BIRD
Right. What is it…

AUBREY LOVELL
Okay, so I need you to do something. On your last hint, I need you to write this down for me. You ready? So write down technology now.

I want you to look in your hand. What are you holding….

MICHAEL BIRD
A… piece of paper???

AUBREY LOVELL
So this is actually the story of …. The biro, which is a ballpoint pen.

MICHAEL BIRD
My other hand, my other hand!

AUBREY LOVELL
Your other hand, sorry I was not clear! You were probably just looking at a blank hand like, “what are you doing?”

MICHAEL BIRD
Ha ha ha

AUBREY LOVELL
In 1943, the first successful ball point pen was created by a pair of Hungarian brothers: László and György Bíró. Initially, their invention was patented in Britain before the advent of World War Two, however the brothers were forced to flee Hungary, to Argentina, to avoid persecution so the pen itself was not marketed until 1943 after they were granted an Argentinian patent.

So László had wanted to create a pen which had fast drying ink, and didn’t smudge - unlike the fountain pens used at the time. Working with his brother, they created a thick, viscous ink which paired with the ball point of the pen to create the pen we all know and use today.

I do want to give you a little credit for one of the things you mentioned last week, Michael, because you did say something about the “space race”. Now, there is a myth that the Biro was created as part of the space race which is sadly untrue, however NASA did transition to what was known as a “space pen” in the late sixties which was a pressurised ballpoint pen which could be used in zero gravity which is pretty cool.

MICHAEL BIRD
I remember going to one of those shops in London that has like sort of sellers everywhere that would sell the new fangled thing. I remember this has been a bit like mid nineties selling the space pen and you could buy the space pen and everyone was like, “ooh the space pen”. But most people use the pen normally.

AUBREY LOVELL
Right? It's not like you could actually test it out, which would be cool if you could, but… what do you have for us next week, Michael?

MICHAEL BIRD
Ok so it’s 1963 and this 26-year-old cosmonaut is about to set an off-world record…

MICHAEL BIRD
Cosmonaut is the clue, so it's… it’ll be like a USSR… Yeah I’m going to say maybe Yuri Gagarin goes to space but I think that was probably the ‘50s. I probably got that wrong.

AUBREY LOVELL
Well I’m going to double that because I have no idea who that is but I trust you, Bird.

AUBREY LOVELL
Okay that brings us to the end of Technology Now for this week.

Thank you to our guest, Bronis,

And of course, to our listeners.

Thank you so much for joining us.

If you’ve enjoyed this episode, please do let us know – rate and review us wherever you listen to episodes and if you want to get in contact with us, send us an email to Technology Now at HPE dot com. We would love to hear from you.

MICHAEL BIRD
Technology Now is hosted by Aubrey Lovell and myself, Michael Bird.

This episode was produced by Harry Lampert and Izzie Clarke with production support from Alysha Kempson-Taylor, Beckie Bird, Paul Rosien, Alissa Mitry and Renee Edwards.

AUBREY LOVELL
Our social editorial team is Rebecca Wissinger, Judy-Anne Goldman and Jacqueline Green and our social media designers are Alejandra Garcia, and Ambar Maldonado.

MICHAEL BIRD
Technology Now is a Fresh Air Production for Hewlett Packard Enterprise.

And we’ll see you next week. Cheers!

Hewlett Packard Enterprise