lhc_olympics:data_file_format [LHC Olympics]
 

How to Read LHC Olympics Data Files

The data files are ordinary text files with a long list of “events”, proton-proton collisions whose spray of outgoing particles has been deemed sufficiently interesting by the (simulated) detector. Each event consists of a set of rows in the data file. Each row corresponds to an “object”: a lepton, photon, jet, or missing transverse momentum. More information about how objects are identified appears on the particle identification page.

You can write your own software to parse these text files and analyze them yourself. Alternatively, you can use Chameleon, a user-friendly Mathematica analysis package developed by Natalia Toro and Philip Schuster at Harvard.

Note that the data format has changed since the Second LHC Olympics, so be sure to read the following. The format for b-tagging has changed, the locations of the charge and some mass information have been adjusted, the default muon isolation cut has been removed, and some other details have changed as well. Some conversion of your old software will therefore be necessary. If you prefer, you can also use the cleaning script to convert the current format to something closer to the old one.

Note also that the definition of a jet has changed in the new version of PGS. Cone jets have been replaced with kT jets. This will affect the appearance of kinematic distributions compared to the earlier version used in prior Olympics.

Column Formats

   #   typ     eta    phi       pt  jmass  ntrk  btag   had/em  dummy dummy  
  • The first column of each row is just a counter that labels the object.
  • Each event begins with a row labelled “0”, which contains the event number and the triggering information. If you are just beginning, the triggering information is probably of little interest and can be ignored. The remaining rows are the physics objects in the event, in which you are definitely interested; the last of them is always the missing transverse momentum (MET). The next event again begins with a row labelled “0”, and so on.
  • The second column indicates the type of object whose properties are given in the row. In particular:
    • 0 = photon
    • 1 = electron
    • 2 = muon
    • 3 = hadronically-decaying tau
    • 4 = jet
    • 6 = missing transverse energy
  • The next three columns give the pseudorapidity, the azimuthal angle, and the transverse momentum of the object. For massless objects, this information is enough to construct the entire four-vector of the physics object.
  • The sixth column gives the invariant mass of the object; for a jet, this is constructed from all the energy and momentum contained within it. To calculate the correct four-vector for a jet, be sure to include the invariant mass information.
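
As a minimal sketch of the four-vector construction described above, the following Python function converts the eta, phi, pt and jmass columns into (E, px, py, pz). The function name and the ordering of the returned tuple are illustrative choices, not part of the data format; leaving the mass at zero gives the massless case.

```python
import math

def four_vector(eta, phi, pt, mass=0.0):
    """Return (E, px, py, pz) in GeV for one object row.

    Hypothetical helper: the name and return order are assumptions
    made for this example only.
    """
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    # pz follows from eta = -ln[tan(theta/2)], i.e. pz / pt = sinh(eta)
    pz = pt * math.sinh(eta)
    # The jmass column enters only through the energy
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return energy, px, py, pz
```

For a jet, pass the sixth column as `mass`; omitting it underestimates the jet energy.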

The above is sufficient to specify the kinematic information in the event. One can make a lot of progress with just this information. We have output some additional information that can also be useful:

  • The seventh column gives the number of tracks associated with the object; in the case of a lepton, this number is multiplied by the charge of the lepton. (Thus a mu- will appear as -1, a positron as +1, a tau- as -1 or -3, and a jet as a positive number or zero.)
  • The eighth column is 1 or 2 for a jet that has been “tagged” as containing a b-quark (actually a heavy flavor tag that sometimes indicates c-quarks); otherwise it is 0. The difference between 1 and 2 is described in the heavy flavor tagging section. For muons, this column has a special meaning: the integer part of the number is the identity of the jet (see column 1) that is closest to the muon in Delta R. This information can be used later by the cleaning script. There are a few more details where muons are discussed.
  • The ninth column is the ratio of the hadronic to the electromagnetic energy deposited in the calorimeter cells associated with the object; it is typically > 1 for a jet and ≪ 1 for an electron or photon. One reason this column is useful is that photons and electrons are only identified out to eta = 3.0 whereas jets are clustered out to eta = 5.0, so an energetic jet with 3.0 < eta < 5.0 is very likely an electron or a photon if the had/em fraction is low. Again, this column has a special meaning and format for muons. The format is xxx.yy. To the left of the decimal point (the ‘xxx’) is ptiso, the summed pT in an R=0.4 cone (excluding the muon). To the right of the decimal point is etrat, a value between .00 and .99 that gives the ratio of the transverse energy in a 3×3 grid surrounding the muon to the pT of the muon. For well-isolated muons, both ptiso and etrat will be small. The values can be used later by the cleaning script, or by advanced users who wish to design their own muon isolation cuts.
  • The tenth and eleventh columns have been added for possible future use; in this round of the Olympics they are always zero.
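
The column layout above can be read with a few lines of Python. This is only a sketch under stated assumptions: the function name `parse_events`, the dictionary keys, and the choice to skip any line whose first field starts with `#` are illustrative and are not part of any official tool. Only the first eleven fields of an object row are used, so trailing annotations are ignored.

```python
from typing import Dict, Iterable, Iterator

# Key names for columns 3 to 11; these labels are an assumption of this sketch.
COLUMNS = ("eta", "phi", "pt", "jmass", "ntrk", "btag", "had_em", "dummy1", "dummy2")

def parse_events(lines: Iterable[str]) -> Iterator[Dict]:
    """Yield one dict per event from an iterable of text lines."""
    event = None
    for line in lines:
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # blank line or column-header line
        if fields[0] == "0":
            # Row "0" opens a new event: event number, then trigger word
            if event is not None:
                yield event
            event = {"number": int(fields[1]),
                     "trigger": int(fields[2]),
                     "objects": []}
        elif event is not None:
            obj = {"index": int(fields[0]), "type": int(fields[1])}
            obj.update(zip(COLUMNS, map(float, fields[2:11])))
            event["objects"].append(obj)
    if event is not None:
        yield event  # flush the last event in the file
```

A typical call would be `for event in parse_events(open(path)): ...`, where `path` is whichever data file you downloaded.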

Some Sample Events

A typical event may look like

   #   typ     eta    phi       pt  jmass  ntrk  btag   had/em  dummy dummy  
   0           103   2563                                                    this is event number 103, and its trigger word value is 2563 
   1    2   -1.219  4.739   449.95   0.11   1.0   0.0    12.15   0.0   0.0   a (positively-charged) muon with a pT of 450 GeV, ptiso= 12 GeV, etrat=0.15 
   2    4   -1.729  1.557   687.76 592.46  37.0   0.0     4.41   0.0   0.0   a jet with a pT of 688 GeV, invariant mass of 592 GeV, and 37 charged tracks 
   3    4   -0.829  2.540    67.26  20.33   5.0   0.0     3.55   0.0   0.0   a jet with a pT of 67 GeV, invariant mass of 20 GeV, and 5 charged tracks 
   4    6    0.000  4.857   275.16   0.00   0.0   0.0     0.00   0.0   0.0   the "missing transverse energy" in the event is 275 GeV 

Here is another one, probably containing jets from a b quark and a b antiquark (one jet is tagged with a displaced vertex, the other has a nearby soft muon):

   #   typ     eta    phi       pt  jmass  ntrk  btag   had/em  dummy dummy  
   0             5   3587                                                    this is event number 5, and the trigger word is 3587
   1    2    1.169  4.197    6.30    0.11   1.0   3.0     0.00   0.0   0.0   a muon with a pT of 6 GeV, the 3 in the b tag column tells you it is close to the third object
   2    4   -0.121  1.278  330.12  206.58   6.0   2.0     3.50   0.0   0.0   a jet that passed a "tight" b-tag criterion
   3    4    1.207  4.216  306.56   27.99  16.0   0.0     0.73   0.0   0.0   the jet that is close to the muon
   4    4   -0.357  5.635   79.27   10.92   8.0   0.0     1.31   0.0   0.0   
   5    4   -0.965  4.076   17.42    7.24   3.0   0.0     0.63   0.0   0.0
   6    4   -2.073  0.696    8.75    4.07   1.0   0.0     1.93   0.0   0.0
   7    4   -3.717  1.975    6.81    2.30   1.0   0.0     0.15   0.0   0.0
   8    6    0.000  1.926   12.42    0.00   0.0   0.0     0.00   0.0   0.0
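
The two muon rows shown so far illustrate the special muon encoding of the ntrk, btag and had/em columns. A small sketch of how one might unpack them follows; the function name `decode_muon` and the returned tuple are hypothetical, and the rounding to two decimals assumes the xxx.yy format described above.

```python
def decode_muon(ntrk, btag, had_em):
    """Unpack the muon-specific meaning of columns 7, 8 and 9.

    Returns (charge, nearest_jet, ptiso, etrat). Assumes ntrk is
    nonzero for a muon row, as in the sample events.
    """
    charge = 1 if ntrk > 0 else -1       # sign of the track count
    nearest_jet = int(btag)              # integer part: closest jet's row label
    ptiso = int(had_em)                  # 'xxx': summed pT in the R=0.4 cone, GeV
    etrat = round(had_em - ptiso, 2)     # '.yy': 3x3 grid ET over muon pT
    return charge, nearest_jet, ptiso, etrat

print(decode_muon(1.0, 0.0, 12.15))  # (1, 0, 12, 0.15)
print(decode_muon(1.0, 3.0, 0.00))   # (1, 3, 0, 0.0)
```

The first call uses the muon from event 103 and the second the soft muon from event 5, whose nearest jet is object 3.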

This event has an energetic electron and positron:

   #   typ     eta    phi       pt  jmass  ntrk  btag   had/em  dummy dummy  
   0             3   3599
   1    1   -0.060  2.878  359.51    0.00  -1.0   0.0     0.02   0.0   0.0   electron
   2    1    0.398  6.041  368.07    0.00   1.0   0.0     0.01   0.0   0.0   positron
   3    4    3.516  4.651   25.72   36.62   5.0   0.0     1.44   0.0   0.0
   4    4   -0.036  1.763   13.38   10.65   1.0   0.0     0.83   0.0   0.0
   5    4   -2.793  3.500   12.06    7.75   5.0   0.0    12.24   0.0   0.0
   6    4    1.068  1.243   10.00    4.53   3.0   0.0     7.71   0.0   0.0
   7    4   -3.969  0.688    9.79    3.79   5.0   0.0     7.78   0.0   0.0
   8    6    0.000  2.612   11.76    0.00   0.0   0.0     0.00   0.0   0.0
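
A natural first thing to compute for an event like this is the invariant mass of the electron-positron pair. The sketch below sums the four-momenta built from the eta, phi, pt and jmass columns; the function name `invariant_mass` and the tuple layout are assumptions made for illustration.

```python
import math

def invariant_mass(objects):
    """Invariant mass in GeV of a set of (eta, phi, pt, mass) tuples."""
    e_sum = px_sum = py_sum = pz_sum = 0.0
    for eta, phi, pt, mass in objects:
        px = pt * math.cos(phi)
        py = pt * math.sin(phi)
        pz = pt * math.sinh(eta)
        e_sum += math.sqrt(px * px + py * py + pz * pz + mass * mass)
        px_sum += px
        py_sum += py
        pz_sum += pz
    m2 = e_sum ** 2 - px_sum ** 2 - py_sum ** 2 - pz_sum ** 2
    # Guard against tiny negative values caused by rounding
    return math.sqrt(max(m2, 0.0))

# Objects 1 and 2 of event 3 above: (eta, phi, pt, jmass)
electron = (-0.060, 2.878, 359.51, 0.00)
positron = (0.398, 6.041, 368.07, 0.00)
pair_mass = invariant_mass([electron, positron])
print(f"m(e+ e-) = {pair_mass:.1f} GeV")
```

Filling a histogram of this quantity over many events is one simple way to look for a resonance in the dilepton channel.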

Kinematics

  • pseudorapidity — “eta” is related to the angle “theta” relative to the beam axis via eta = -ln[tan(theta/2)]. For massless particles, eta is the same as the rapidity y.
  • azimuthal angle — the angle “phi” is the angle around the beam axis in cylindrical coordinates.
  • R — denotes the angular distance Sqrt[(eta2-eta1)^2+(phi2-phi1)^2] as measured in (eta,phi) space.
  • transverse momentum — components of the momentum orthogonal to the beam axis.
  • invariant mass — the square root of the Lorentz-invariant square, E^2 - |p|^2, of the summed four-momentum of two (or more) objects. The invariant mass of a jet is computed in the same way from the sum of all the mini-object four-momenta. Note that this is generally much larger than the mass of the quark which generated it.
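
The pseudorapidity and R definitions above translate directly into code. The sketch below uses hypothetical function names; it also wraps the phi difference into the range from -pi to pi, which the formula for R leaves implicit but which matters because phi in the data files runs from 0 to 2 pi.

```python
import math

def pseudorapidity(theta):
    """eta = -ln[tan(theta/2)], with theta measured from the beam axis."""
    return -math.log(math.tan(theta / 2.0))

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance in (eta, phi) space with a periodic phi difference."""
    dphi = (phi2 - phi1 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta2 - eta1, dphi)

# Soft muon (object 1) and nearby jet (object 3) from event 5 above
separation = delta_r(1.169, 4.197, 1.207, 4.216)
print(f"Delta R = {separation:.3f}")
```

Without the wrap, two objects at phi = 0.1 and phi = 6.18 would appear far apart even though they are nearly collinear in azimuth.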
 

lhc_olympics/data_file_format.txt · Last modified: 2006/12/19 11:31 by olympian
 