Distributed Computing at Work: 600 Million Collisions to Make a “Pentaquark” Discovery in Particle Physics

22 Jul 2015 1:29pm

We are sometimes reminded of the power of modern, large-scale computing when we see how scientists use distributed infrastructure to make remarkable discoveries. A discovery at CERN’s Large Hadron Collider last week illustrates this well.

The discovery in question is the “pentaquark,” found as part of the ongoing LHCb (Large Hadron Collider “beauty”) experiment. According to Wikipedia, a pentaquark is a hypothetical subatomic particle consisting of four quarks and one antiquark bound together. Scientists have been theorizing about the elusive pentaquark for five decades, but this marks the first time that researchers have found unambiguous evidence of its existence.

“The pentaquark is not just any new particle,” LHCb spokesperson Guy Wilkinson said of the particle’s exotic state in the press release. “It represents a way to aggregate quarks, namely the fundamental constituents of ordinary protons and neutrons, in a pattern that has never been observed before in over 50 years of experimental searches. Studying its properties may allow us to understand better how ordinary matter, the protons and neutrons from which we’re all made, is constituted.”

Accidental Breakthrough

The discovery happened almost unexpectedly, while scientists were studying post-collision decay data for another particle, the “bottom lambda baryon,” which is made of three quarks. “It was a complete accident,” says Sheldon Stone, a physicist at Syracuse University in New York and co-author of the recent study. “[The pentaquark] found us.”

What’s more, it appears that the pentaquark could take one of two forms: one in which the five quarks are tightly bound in a single package, and another in which the quarks are loosely bound in two packages, resembling a “molecular state.” Scientists are still unclear on the specifics.


New Infrastructures Make it Possible

But none of these insights would be possible without the massive technology infrastructure that underpins the whole LHC project. For one thing, the approximately 600 million particle collisions per second in the 27 kilometer (16.7 mile) long accelerator generate massive amounts of data, which are recorded as a series of electronic signals and sent to the CERN Data Centre to be digitally reconstructed. Around 30 petabytes of “collision event” data are generated per year, which scientists must comb through and analyze to find particles like the pentaquark.

To parse this flood of information, CERN has used a worldwide distributed computing grid since 2002, which gives a community of more than 8,000 physicists almost instant access to the data. The grid builds upon the World Wide Web, which originated at CERN as a way for scientists to share information. On site, the CERN Data Centre processes about one petabyte of data per day, using 11,000 servers with 100,000 processor cores, while off site, the Worldwide LHC Computing Grid (WLCG) handles more than two million jobs per day.
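To put those figures on a more familiar scale, here is a rough back-of-the-envelope sketch that converts them into per-second rates. It uses only the numbers quoted above. The decimal definition of a petabyte, the 365-day year, and the function name `summarize` are assumptions made for illustration, and the results are averages rather than anything CERN reports directly.

```python
# Back-of-the-envelope rates from the figures quoted in the article.
# Assumptions: 1 PB = 10**15 bytes (decimal), a 365-day year, and
# load spread evenly over time (real workloads are bursty).
PB = 10**15
GB = 10**9
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365

event_data_pb_per_year = 30      # "collision event" data generated per year
processed_pb_per_day = 1         # processed by the CERN Data Centre per day
grid_jobs_per_day = 2_000_000    # jobs handled by the WLCG per day
servers = 11_000
cores = 100_000


def summarize():
    """Return average rates implied by the quoted figures."""
    seconds_per_year = DAYS_PER_YEAR * SECONDS_PER_DAY
    return {
        "event_data_gb_per_second": event_data_pb_per_year * PB / seconds_per_year / GB,
        "processed_gb_per_second": processed_pb_per_day * PB / SECONDS_PER_DAY / GB,
        "grid_jobs_per_second": grid_jobs_per_day / SECONDS_PER_DAY,
        "cores_per_server": cores / servers,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions, 30 petabytes per year averages out to a little under 1 GB of event data every second, one petabyte per day is roughly 11.6 GB processed per second, two million jobs per day is about 23 jobs per second, and the Data Centre averages about nine cores per server.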


This staggering technological infrastructure and its distributed computing network are what make these unprecedented discoveries about the fabric of the universe possible. Our thirst for a deeper knowledge of the very constituents of matter, and by extension the meaning of the universe, is boundless, and it’s fitting that these micro-scale explorations are sustained by the macro-scale interconnections that make those profound revelations a reality. More over at CERN.

Images via CERN.
