Right after the recent HPC User Forum in Edinburgh, Scotland, we were kindly invited by the local forum organizer Mark Parsons, Director of the Edinburgh Parallel Computing Centre (EPCC) at the University of Edinburgh, to his office for an interview. EPCC is preparing for the next supercomputer, Archer2, but there are also bold plans to bring an exascale supercomputer to the UK at the end of 2022 or the beginning of 2023. Old mines play an important role in reusing the massive amounts of heat an exascale supercomputer will produce.
Were you happy with the HPC User Forum?
Mark Parsons: Yes, I was really happy with the User Forum. I haven't been to that many User Fora myself; I only started going to them a few years ago. Steve Conway asked me to organize this one about a year ago. There was a nice number of people; these fora tend to have 60 to 80 people. Here, we had a nice balance of speakers from around the world. People can speak quite openly at these fora, and I think this is good.
Now we are at EPCC. We must say we've never been here before; it seems to be a rather new building.
Mark Parsons: Yes, you're in the Bayes Centre, named after Thomas Bayes, known for Bayes' theorem. He was an Edinburgh student. This is a brand new building, opened last summer. It is focused entirely on organisations in the University and the local area that do data science. The big change for EPCC over the last few years is that we are now very focused on both supercomputing (HPC) and data science. We moved from quite old offices about three or four miles away and came down here, where we have 150 seats. We have about a hundred EPCC staff here. I also have a big area for our MSc and PhD students, because we like having the students embedded with us, so they can have coffee with us and just talk to us. It is a nice way of teaching and learning, which is good.
What is the current status of the Centre?
Mark Parsons: We are the biggest we have ever been, to be quite honest. For a long time, we were at 70 to 80 people. At the University, I was deeply involved in writing something called the City Deal. The City Deal receives money from central government, including the UK government and the Scottish government. It is supposed to increase innovation in your local region, and this one is for the Edinburgh City region. The bid we put in was for a huge investment focused on data-driven innovation. This building and another four buildings around the University are all focused on data-driven innovation and on working with the public, private and third sectors in terms of embedding that in the local economy.
You are still operating one of the larger systems in Europe?
Mark Parsons: Archer, which is the current national system, is actually quite small and quite old. It is just coming to the end of its life. It will be turned off in February 2020 and then Archer2 will come in. It will be one of the big systems, certainly within the top 20 of the TOP500 list.
That is the UK national supercomputer?
Mark Parsons: Yes, that is the national HPC system for the UK. The predominant users of that are physical sciences users and environmental science users. We also have another national service, called the Tesseract service, for the DiRAC Consortium. That is focused on astronomy, particle physics and cosmology. Then we run a Tier 2 service as well, mainly focused on industry, and another small service around that. But the big recent investment actually isn't in HPC. As part of the City Deal, I received GBP 20 million to build a new data centre room, and then I've got another GBP 90 million to build this data infrastructure over the next 10 years. It is really good because it is long-term funding for capital equipment to support all the data activities that we will have. I'm very happy with that.
The different machines serve different user communities, we assume?
Mark Parsons: It probably started accidentally but now, obviously, it is a strategy. The UK has had a good model of effectively creating ramps towards different levels of computing. There is a very long tail of researchers who need quite basic infrastructure support. Then there is definitely a group who sit at what we call in the UK Tier 2: they are already accomplished HPC users, but they don't need a system with hundreds of thousands of cores. My Tier 2 is 10,000 cores, or actually a little bit more than that. The average job size there will be quite small, 64 or 128 cores, whereas on the big national service Archer we actively don't let users run small jobs like that. You want to push them towards thousands of cores. You have to meet everybody's needs, and you just can't meet them all at a single point.
There was an interesting fact in Satoshi Matsuoka's talk where he pointed out that the system being installed at RIKEN, the Post-K or Fugaku as they call it, will be broadly used. So Tier 2 users will still be happily invited onto the Fugaku system, as well as, of course, the really big capability users. I think that is very attractive because, at the end of the day, these capabilities are going to be bought by governments and they should be full of users, provided they are operated in a way that when a big user does come along wanting to use two, three or four hundred thousand cores, you can make way for them and let them do those really big jobs as well.
Talking about big machines, people in the HPC community are talking about exascale in the US, exascale in Europe, exascale in Japan, and exascale in China. We heard you are talking about exascale in the UK as well?
Mark Parsons: The UK had an interesting decision to make, or would have had if Brexit hadn't happened, which was whether we would join EuroHPC or decide to do our own thing. Now, in some ways, the Brexit discussion has made that decision for us, although that is not entirely true at the moment because the funding for EuroHPC is through Horizon 2020, so we can still bid to these Calls. Once this current set of Calls is over, we won't be able to bid, because we would need to be a member of the EuroHPC Joint Undertaking, and to be a member of the Joint Undertaking you need to be either an Associated State or a Member State, and we are not going to be either. That raises the question: What is the plan? What will the UK do?
We initially discussed taking part in one of the EuroHPC consortia, but it was decided that we wouldn't do that. There is a piece of work I have been doing which has been looking at various options. I have been arguing very clearly with the government that we have to do something; doing nothing is not an option, given that all of our competitor economies are going to be doing something. Then it comes down to the following: Does the UK join EuroHPC somehow? Does it partner with another country, like America or Japan, or even China? Does the UK build a system as is happening in Europe, doing its own silicon and trying to do something really quite unusual? Or does the UK just buy a system from a vendor and benefit from all the other R&D going on around that? At the moment, the case we are making is basically that the UK should buy a system.
Buying a system means that you do so because you think it is important for scientific and industrial users to have access to such a machine?
Mark Parsons: I would pose the question slightly differently. We know that supercomputing is important. We know that for every car we drive we need supercomputing. In some way, exascale is completely artificial. It is just a bigger computer. The supercomputing community throughout its history has set itself targets, such as the gigaflop era, the teraflop era, the petaflop era, and we are now going to the exaflop era. That is fine, that is how you make progress. But if the UK was simply to remain in the few tens of petaflops era, we would be overtaken by all of the major economies that we compete with around the world. So, you have to think of exascale computing as a fantastic capability but also make the case around the fact that you need to do it because your competitors are going to have access to it. If you don't have access to it, then your products and services will be non-competitive.
Are there typical UK applications that make the case for this type of machine?
Mark Parsons: I was asked a very similar question recently by the government, about the case we have made, in which we say that the UK is very strong in computational science. The query came back asking us to justify that statement. You suddenly think: where do I go to justify that statement? Now, in truth, the way to justify it is to look at the strength of UK science as a whole. It is very strong if you look at the number of papers and the number of citations. The UK is known worldwide for punching very high up the ladder relative to the amount of money that goes into UK science. We also know that many of these scientific disciplines simply cannot experiment in the physical world to prove their theories or the things they have discovered or are interested in. The fact that we are doing this broad amount of science, and lots of it, drives the fact that we are doing lots of computational science.
The work that PRACE did five or six years ago, looking at what the codes each of us was running in our centres looked like, shows that there are differences in emphasis, but actually the codes that are used broadly across Europe are very similar. I think that is true, and I think there is a lot of overlap between us and the US, for instance. Not so much with Japan, because Japan has a history of investing long term in particular applications that are important to the country, like earthquake modelling, for example.
What is the current status? Is there a kind of plan?
Mark Parsons: Where we are with this project is that a business case for the government exists in draft form. It will be submitted later this year and then it goes through a series of stages where it will be refined, questioned, maybe denied. I just don't know. At the end of the day, it is a very large amount of money and the government will have to take a decision. I just don't know what will happen. But it is important that we have done the work thoroughly in making the case.
If it happens, what are the time steps and what is the time scale?
Mark Parsons: The time scale currently foreseen is the following. The discussion in the case basically says that if we wait until, say, 2026, America will be going to systems of 10 or more exaflops; that is when they do the renewal of their 2021 systems. We don't particularly want to try to do something in 2021, because I think these systems are going to be really tricky to deploy. Lots of work has to be done around them because they really are phenomenal bits of IT engineering. However, there is definitely a sweet spot around maybe late 2022 or early 2023, where you can benefit from all the work that has been going on in 2021 and early 2022 and deploy technologies that have actively been debugged by the people who are right at the front of the queue. That is the time scale we are looking at.
If you are preparing for such an exascale system, we suppose you cannot place it in your current data centre, or can you?
Mark Parsons: This is one of the things that isn't talked about enough. Hosting these systems is going to be very difficult. The positive side is that we are getting much more information about how big these systems will be. I am building a new computer room at the moment, and perhaps an exascale system will fit in it, or I can extend it a little bit. It is not just about the room itself: it is about delivering the power and the cooling, and also worrying about the environment, because these systems are going to consume anywhere between 20 and 40 megawatts of electricity, producing a significant amount of carbon dioxide.
At the moment, I have already started putting in place the 32-megawatt power supply: I have eight megawatts already and I am adding another 24. We are also looking at what we do with all the hot water that comes out of the system. We are entering a new phase of supercomputing because these computers are going to have water going into them that is hotter than it ever gets in Scotland. I won't need to chill the water any longer. That is great, but then what do I do with this hot water? One of the ideas we are exploring is pumping that hot water down into old mine workings under the data centre to create a heat battery, from which the heat can then be taken out with heat pumps to heat buildings up to five or ten miles away. We are very excited by this. It is just at the planning stage at the moment; we are doing a big feasibility study. I think we do need to think about this sort of thing as the amount of power we are consuming in supercomputing goes up and up and up.
If we understand correctly, you are not just collecting the heat that comes out of the computer, but you also have a way to store it. Currently, most people just heat the buildings that are nearby.
Mark Parsons: If you have an underground river or a lake or, as we have, old mine workings, you can heat the water in the mine workings up. It gradually heats the rock up and the rock is really good at storing that heat. So, you can effectively create a heat battery.
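The heat-battery idea above can be sketched with a rough back-of-envelope calculation. All material constants and the 20 K temperature swing below are illustrative assumptions of ours, not figures from the interview; only the 20-40 megawatt range comes from Mark Parsons' remarks.

```python
# Rough estimate of the rock volume a mine-working "heat battery" would
# need to absorb one day of waste heat from an exascale system.
# Assumed values (not from the interview): rock density, specific heat,
# and the temperature swing of the rock.

waste_heat_mw = 30.0          # mid-range of the 20-40 MW quoted
seconds_per_day = 24 * 3600

rock_density = 2600.0         # kg/m^3, typical for sedimentary rock (assumed)
rock_specific_heat = 900.0    # J/(kg*K) (assumed)
delta_t = 20.0                # K temperature rise of the rock (assumed)

# Heat rejected per day, in joules
heat_per_day_j = waste_heat_mw * 1e6 * seconds_per_day

# Volume of rock needed to absorb one day's heat at the assumed swing
rock_volume_m3 = heat_per_day_j / (rock_density * rock_specific_heat * delta_t)

print(f"Heat rejected per day: {heat_per_day_j:.3e} J")
print(f"Rock volume to store one day's heat: {rock_volume_m3:,.0f} m^3")
```

Under these assumptions, a single day's waste heat warms tens of thousands of cubic metres of rock by 20 K, which suggests why extensive mine workings, rather than a small buried tank, make an attractive storage medium.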
This sounds like an interesting idea. Thank you very much for this interview.