At the EGI Conference 2016 in Amsterdam, The Netherlands, we had the opportunity to talk with Eckhard Elsen, Director of Research and Computing at CERN in Geneva, Switzerland. Eckhard Elsen talked about some of the programmes that are being performed at CERN. The main business at CERN is fundamental physics and particle physics, in which the researchers collide protons at very high energy and look at the products of the collision. The collisions are registered in huge detectors that produce enormous data rates and data streams for analysis.
The principle of operation at the CERN Large Hadron Collider is to have the protons collide and register the data. The physics group then sits together to sieve through the data samples to understand what is in them. Since they are looking for very rare collisions - the collision is a proper ballistic process - they have to sieve through a lot of data to really find the new physics that they are looking for. An example is the Higgs discovery that happened four years ago. After a long search for that particle, when it was finally found, it led to the Nobel Prize awarded to François Englert and Peter W. Higgs.
The researchers would like to repeat such a fundamental discovery again. That is why they increase the interaction rate of the collisions, which is called 'luminosity'. What they want to do over the next six years is to gain one order of magnitude in the interaction rate, and then they are looking forward to gaining another order of magnitude by 2025. The data rate at these times is already among the biggest created artificially in the world. If you add a factor of 100, you can understand that the researchers will have some of the biggest computing needs anywhere.
The technology CERN has been using for some 15 years is the Worldwide LHC Computing Grid, which has been performing well. It consists of a federated network of fast computers, with links in between them, so the researchers can analyze the data, be it in the US, in Europe, or in Asia. The next step of this evolution is a somewhat more generalized approach which typically is cast into the word "Cloud". What the researchers are moving to is the so-called scientific Cloud. It is an environment where the data can be analyzed without one knowing in detail what type of computer is doing the analysis and where it is located. The researchers have to store the data for a long period, so they also need persistent data storage, one of the other requirements in their business.
The researchers are very much looking forward to the developments that are now emerging in many kinds of sciences, in which the scientific Cloud is the buzzword of the future. The researchers want to put some flesh onto that idea.
For the first time, CERN was part of the ESFRI Roadmap with the High Luminosity Large Hadron Collider. Was that a kind of collaboration?
Eckhard Elsen: It is a slight change in the philosophy at the European Commission. There was an ESFRI Roadmap in 2012 which had the same project included, but at that time the consideration was that the high energy physics community worldwide had evaluated the project and had given its verdict. In fact, the CERN LHC was mentioned in an appendix to that roadmap. Now, the High Luminosity LHC project is gaining so much exposure that it simply needs to be on the ESFRI Roadmap. Such a prestigious project needs to be mentioned. It has found its place, not as a future project that still needs to go through the approval phase, but as a so-called landmark, which assumes that it simply will be done.
If you talk about large computing and large data needs, can you put some numbers on it?
Eckhard Elsen: To put it simply, one number that was heard at the conference was that more than 60 percent of the traffic that EGI looked at is generated by research scientists. The researchers want to preserve the data for a long time. The data storage that needs to be provided is huge. We are talking about many hundreds of Petabytes of data, and this will grow unless the researchers learn how to reduce the data volume by making the data much more compact.