Searching for exotic particles in high-energy physics with deep learning

This benchmark task was recently considered by experiments at the LHC (ref. 10) and the Tevatron colliders. The light Higgs boson decays predominantly to a pair of bottom quarks, giving the signal process $gg \to H^0 \to W^\mp H^\pm \to W^\mp W^\pm h^0 \to W^\mp W^\pm b\bar{b}$. The background process, which mimics this final state without the Higgs boson intermediate state, is the production of a pair of top quarks, each of which decays to $Wb$, also giving $W^\mp W^\pm b\bar{b}$ (see Fig. 1).

In both cases, the resulting particles are two W bosons and two b-quarks. Simulated events are generated with the MadGraph5 event generator. For the benchmark case here, fixed masses for the charged and neutral heavy Higgs bosons have been assumed. We consider events which satisfy basic selection requirements on the observed leptons and jets. Events which satisfy these requirements are naturally described by a simple set of features which represent the basic measurements made by the particle detector: the momentum of each observed particle.

In addition, we reconstruct the missing transverse momentum in the event and have b-tagging information for each jet. Together, these twenty-one features comprise our low-level feature set. Figure 2 shows the distribution of a subset of these kinematic features for signal and background processes.

Figure 2: Distributions of the transverse momenta $p_T$ of each observed particle (a–e) and of the imbalance of momentum in the final state (f), for simulated signal (black) and background (red) benchmark events.

Momentum angular information for each observed particle is also available to the network, but is not shown, as the one-dimensional projections have little information. The low-level features show some distinguishing characteristics, but our knowledge of the different intermediate states of the two processes allows us to construct other features which better capture the differences. As the difference in the two hypotheses lies mostly in the existence of new intermediate Higgs boson states, we can distinguish between the two hypotheses by attempting to identify whether the intermediate state existed.

This is done by reconstructing its characteristic invariant mass: if a particle A decays into particles B and C, the invariant mass of particle A, $m_A$, can be reconstructed as

$m_A^2 = (E_B + E_C)^2 - \lVert \mathbf{p}_B + \mathbf{p}_C \rVert^2$,

where $E$ and $\mathbf{p}$ are the measured energies and momentum vectors of the decay products (in natural units, $c = 1$). In the signal hypothesis we expect the reconstructed masses to peak at the masses of the intermediate Higgs bosons, $m_{b\bar{b}} \approx m_{h^0}$, $m_{Wb\bar{b}} \approx m_{H^\pm}$ and $m_{WWb\bar{b}} \approx m_{H^0}$, whereas in the case of the background we expect each reconstructed $m_{Wb}$ to peak near the top-quark mass, with no sharp peaks in the other combinations. See Fig. 3. Clearly these contain more discrimination power than the low-level features.

Figure 3: Distributions of invariant mass calculations for simulated signal (black) and background (red) events.

We have published a data set containing 11 million simulated collision events for benchmarking machine-learning classification algorithms on this task, which can be found in the UCI Machine Learning Repository at archive.ics.uci.edu.
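
As a concrete illustration of this calculation, the following minimal sketch (our own; the array layout and function name are illustrative, not part of the published data set) computes the invariant mass of a parent particle from the measured four-momenta of its decay products:

```python
import numpy as np

def invariant_mass(p4_b, p4_c):
    """Invariant mass of a parent particle A -> B + C.

    Each argument is a four-momentum (E, px, py, pz) in natural units (c = 1),
    implementing m_A^2 = (E_B + E_C)^2 - |p_B + p_C|^2.
    """
    p4 = np.asarray(p4_b, dtype=float) + np.asarray(p4_c, dtype=float)
    m2 = p4[0] ** 2 - np.sum(p4[1:] ** 2)
    return np.sqrt(max(m2, 0.0))  # guard against small negative values from rounding

# Two back-to-back massless b-jets of 62.5 GeV each give m_bb ~ 125 GeV,
# the light Higgs boson mass.
print(invariant_mass((62.5, 0.0, 0.0, 62.5), (62.5, 0.0, 0.0, -62.5)))
```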

The second benchmark classification task is to distinguish between a process where new supersymmetric particles (SUSY) are produced, leading to a final state in which some particles are detectable and others are invisible to the experimental apparatus, and a background process with the same detectable particles but fewer invisible particles and distinct kinematic features.

This benchmark problem is currently of great interest to the field of high-energy physics, and there is a vigorous effort in the literature (refs 17, 18, 19, 20) to build high-level features which can aid in the classification task. The task requires distinguishing between these two processes using the measurements of the charged lepton momenta and the missing transverse momentum.

As above, simulated events are generated with the MadGraph event generator, and the masses of the new supersymmetric particles are set to fixed benchmark values. As above, the basic detector response is used to measure the momentum of each visible particle, in this case the charged leptons. In addition, there may be particle jets induced by radiative processes. A critical quantity is the missing transverse momentum, $\vec{p}_T^{\,\mathrm{miss}}$, the negative vector sum of the transverse momenta of the visible particles. Figure 5 gives distributions of low-level features for signal and background processes.
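
As a reference implementation of this standard definition (variable names are illustrative), the missing transverse momentum is the magnitude of the negative vector sum of the visible particles' transverse momenta:

```python
import numpy as np

def missing_pt(px_visible, py_visible):
    """Magnitude of the missing transverse momentum: the negative vector
    sum of the transverse momenta of all visible particles in the event."""
    mex = -np.sum(px_visible)  # x component of the missing momentum
    mey = -np.sum(py_visible)  # y component of the missing momentum
    return np.hypot(mex, mey)
```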

Figure 5: Distributions of low-level features in simulated samples for the SUSY signal (black) and background (red) benchmark processes.

The search for supersymmetric particles is a central piece of the scientific mission of the Large Hadron Collider. The strategy we applied to the Higgs boson benchmark, of reconstructing the invariant mass of the intermediate state, is not feasible here, as there is too much information carried away by the escaping neutrinos (two neutrinos in this case, compared with one for the Higgs case).

Instead, a great deal of intellectual energy has been spent in attempting to devise features that give additional classification power. These include high-level features such as the axial $E_T^{\mathrm{miss}}$: the missing transverse energy along the vector defined by the charged leptons.

Figure 6: Distributions of high-level features in simulated samples for the SUSY signal (black) and background (red) benchmark processes.
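
Returning to the axial $E_T^{\mathrm{miss}}$ defined above, a minimal sketch of one possible implementation follows (our own construction; the precise sign and axis conventions used in refs 17, 18, 19, 20 may differ):

```python
import numpy as np

def axial_met(met_x, met_y, lep_px, lep_py):
    """Project the missing transverse momentum onto the axis defined by
    the vector sum of the charged-lepton transverse momenta."""
    axis = np.array([np.sum(lep_px), np.sum(lep_py)], dtype=float)
    axis /= np.linalg.norm(axis)              # unit vector along the lepton system
    return met_x * axis[0] + met_y * axis[1]  # signed axial component
```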

A data set containing five million simulated collision events is available for download at archive.ics.uci.edu. Standard techniques in high-energy physics data analyses include feed-forward neural networks with a single hidden layer and boosted decision trees.
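
Both benchmark data sets can be read with standard tools before applying any of the methods discussed below. A minimal sketch using pandas follows; the file name and column layout (class label first, then low-level features, then high-level features) follow the UCI repository description and should be verified against it:

```python
import pandas as pd

# The UCI files are headerless gzipped CSV: column 0 is the class label
# (1 = signal, 0 = background), followed by low-level, then high-level features.
df = pd.read_csv("SUSY.csv.gz", header=None)

labels = df.iloc[:, 0].to_numpy()
low_level = df.iloc[:, 1:9].to_numpy()   # 8 low-level features in the SUSY set
high_level = df.iloc[:, 9:].to_numpy()   # remaining columns: high-level features

print(df.shape)  # expect (5000000, 19) for the SUSY benchmark
```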

We use the widely used TMVA package (ref. 21), which provides a standardized implementation of common multivariate learning techniques and an excellent performance baseline. We explored the use of deep neural networks (DNs) as a practical tool for applications in high-energy physics. Due to computational costs, this hyper-parameter optimization was not thorough, but it included combinations of the pre-training methods, network architectures, initial learning rates and regularization methods shown in Supplementary Table 3. We selected a five-layer neural network, with the number of hidden units per layer, the learning rate and the regularization strength chosen by this search. Pre-training, extra hidden units and additional hidden layers significantly increased training time without noticeably increasing performance.

To facilitate comparison, shallow neural networks were trained with the same hyper-parameters and the same number of units per hidden layer. Additional training details are provided in the Methods section below. To investigate whether the neural networks were able to learn the discriminative information contained in the high-level features, we trained separate classifiers for each of the three feature sets described above: low-level, high-level and combined feature sets. For the SUSY benchmark, the networks were trained with the same hyper-parameters chosen for the HIGGS, as the data sets have similar characteristics and the hyper-parameter search is computationally expensive.
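
The original training code used the GPU-accelerated Theano and Pylearn2 libraries (see Methods); purely as an illustration of the architectures being compared, the following PyTorch sketch builds a five-layer tanh network and a shallow counterpart with the same number of units per hidden layer (the width shown is a placeholder, not the tuned value from the search):

```python
import torch.nn as nn

def tanh_net(n_inputs, n_hidden=300, n_hidden_layers=5):
    """Fully connected tanh network with a single sigmoid output unit
    giving the probability that an event is signal."""
    layers, width = [], n_inputs
    for _ in range(n_hidden_layers):
        layers += [nn.Linear(width, n_hidden), nn.Tanh()]
        width = n_hidden
    layers += [nn.Linear(width, 1), nn.Sigmoid()]
    return nn.Sequential(*layers)

deep_net = tanh_net(n_inputs=21)                        # 21 low-level HIGGS features
shallow_net = tanh_net(n_inputs=21, n_hidden_layers=1)  # single-hidden-layer baseline
```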

Classifiers were tested on a separate set of simulated examples generated from the same Monte Carlo procedures as the training sets. We produced receiver operating characteristic (ROC) curves to illustrate the performance of the classifiers. Our primary metric for comparison is the area under the ROC curve (AUC), with larger AUC values indicating higher classification accuracy across a range of threshold choices.
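
The ROC curve and AUC are straightforward to compute from a classifier's outputs with scikit-learn; a self-contained sketch with toy stand-ins for the labels and scores:

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)  # toy 0/1 labels (1 = signal)
y_score = np.clip(0.3 * y_true + rng.normal(0.5, 0.25, size=1000), 0.0, 1.0)

fpr, tpr, thresholds = roc_curve(y_true, y_score)
auc = roc_auc_score(y_true, y_score)

# Physics convention: signal efficiency = tpr, background rejection = 1 - fpr.
background_rejection = 1.0 - fpr
print(f"AUC = {auc:.3f}")
```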

This metric is insightful, as it is directly connected to classification accuracy, the quantity optimized for in training. In practice, physicists may be interested in other metrics, such as signal efficiency at some fixed background rejection, or discovery significance as calculated from the P-value of the null hypothesis.

We choose AUC as it is a standard in machine learning and is closely correlated with the other metrics. In addition, we calculate discovery significance, the standard metric in high-energy physics, to demonstrate that small increases in AUC can translate into substantial gains in discovery significance.
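
One common definition of expected discovery significance for a counting experiment (an assumption on our part; the paper's exact calculation may differ) is the Gaussian-equivalent significance derived from the Poisson likelihood ratio:

```python
import math

def discovery_significance(s, b):
    """Median expected significance (in Gaussian sigma) for s expected
    signal events over b expected background events:
    Z = sqrt(2 * ((s + b) * ln(1 + s/b) - s)),
    which reduces to the familiar s / sqrt(b) when s << b."""
    return math.sqrt(2.0 * ((s + b) * math.log(1.0 + s / b) - s))

print(discovery_significance(100.0, 10000.0))  # ~1.0, close to s / sqrt(b)
```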

Note, however, that in some applications the sensitivity to new exotic particles is determined not only by the discriminating power of the selection, but also by the uncertainties in the background model itself. Some portions of the background model may be better understood than others, so that some simulated background collisions have larger associated systematic uncertainties than others. This can transform the problem into one of reinforcement learning, where per-collision truth labels no longer indicate the ideal network output target.

This is beyond the scope of this study, but see refs 22, 23 for stochastic optimization strategies for such problems. Figure 7 and Table 1 show the signal efficiency and background rejection for varying thresholds on the output of the neural network (NN) or boosted decision tree (BDT).

Figure 7: For the Higgs benchmark, comparison of background rejection versus signal efficiency for the traditional learning method (a) and the deep-learning method (b), using the low-level features, the high-level features and the complete set of features.

A shallow NN or BDT trained using only the low-level features performs significantly worse than one trained with only the high-level features.

This implies that the shallow NN and BDT are not succeeding in independently discovering the discriminating power of the high-level features. This is a well-known problem with shallow-learning methods, and motivates the calculation of high-level features. Methods trained with only the high-level features, however, have a weaker performance than those trained with the full suite of features, which suggests that despite the insight represented by the high-level features, they do not capture all the information contained in the low-level features.

The deep-learning techniques show nearly equivalent performance using the low-level features and the complete features, suggesting that they are automatically discovering the insight contained in the high-level features. Finally, the deep-learning technique finds additional separation power beyond what is contained in the high-level features, demonstrated by the superior performance of the DN with low-level features to the traditional network using high-level features.

These results demonstrate the advantage of using deep-learning techniques for this type of problem.


The internal representation of an NN is notoriously difficult to reverse engineer. The NN preferentially selects events with values of the features close to the characteristic signal values and away from background-dominated values. The DN, which has a higher efficiency for the equivalent rejection, selects events near the same signal values, but also retains events away from the signal-dominated region.

The likely explanation is that the DN has discovered the same signal-rich region identified by the physics features, but has in addition found avenues to carve into the background-dominated region.

Figure 8: Distribution of events for two rescaled input features: (a) $m_{Wb\bar{b}}$ and (b) $m_{WWb\bar{b}}$.

For the SUSY benchmark, the improvement is less dramatic, though statistically significant. Table 2 and Supplementary Figs 10 and 11 compare the performance of shallow networks and DNs for each of the three sets of input features. In this SUSY case, neither the high-level features nor the DN finds dramatic gains over the shallow network trained on the low-level features.

The power of the DN to automatically find nonlinear features reveals something about the nature of the classification problem in this case: it suggests that there may be little gain from further attempts to manually construct high-level features. To highlight the advantage of DNs over shallow networks with a similar number of parameters, we performed a thorough hyper-parameter optimization for the class of single-layer neural networks over the hyper-parameters specified in Supplementary Table 4 on the HIGGS benchmark.

The largest shallow network had slightly more parameters than the largest DN, but these additional hidden units did very little to increase performance over a much smaller shallow network. Supplementary Table 5 compares the performance of the best shallow networks of each size with DNs of varying depth. Although the primary advantage of DNs is their ability to automatically learn high-level features from the data, one can imagine facilitating this process by pre-training a neural network to compute a particular set of high-level features. Such a network could then be used as a module within a larger neural network classifier, as in the sketch below.
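
A minimal sketch of this two-stage construction (our own, not the paper's implementation; layer widths are placeholders, and the feature counts refer to the HIGGS benchmark):

```python
import torch.nn as nn

n_low, n_high = 21, 7  # HIGGS benchmark: 21 low-level, 7 high-level features

# Stage 1: train this network to regress the engineered high-level features
# from the low-level inputs (for example, with nn.MSELoss()).
feature_net = nn.Sequential(
    nn.Linear(n_low, 300), nn.Tanh(),
    nn.Linear(300, n_high),
)

# Stage 2: reuse the pre-trained module as the lower stage of a classifier
# and fine-tune the whole stack end-to-end (for example, with nn.BCELoss()).
classifier = nn.Sequential(
    feature_net,
    nn.Tanh(),
    nn.Linear(n_high, 1), nn.Sigmoid(),
)
```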

It is widely accepted in experimental high-energy physics that machine-learning techniques can provide powerful boosts to searches for exotic particles. Until now, physicists have reluctantly accepted the limitations of the shallow networks employed to date; in an attempt to circumvent these limitations, physicists manually construct helpful nonlinear feature combinations to guide the shallow networks.

Our analysis shows that recent advances in deep-learning techniques may lift these limitations by automatically discovering powerful nonlinear feature combinations and providing better discrimination power than current classifiers—even when aided by manually constructed features. This appears to be the first such demonstration in a semi-realistic case.

We suspect that the novel environment of high-energy physics, with high volumes of relatively low-dimensional data containing rare signals hiding under enormous backgrounds, can inspire new developments in machine-learning tools. Beyond these simple benchmarks, deep-learning methods may be able to tackle thornier problems with multiple backgrounds, or lower-level tasks such as identifying the decay products from the high-dimensional raw detector output.

In training the neural networks, the following hyper-parameters were predetermined without optimization. Hidden units all used the tanh activation function. Weights were initialized from a normal distribution with zero mean and a small fixed standard deviation. Gradient computations were made on mini-batches of fixed size. A momentum term increased linearly over the first epochs of training to its maximum value, and the learning rate decayed by a constant factor after each update.
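
The schedule described above can be written compactly as follows (a sketch; the numerical constants are placeholders, since the exact values are not recoverable from this copy):

```python
def training_schedule(epoch, batch_updates, lr0=0.05, lr_decay=1.0000002,
                      mom0=0.9, mom_max=0.99, ramp_epochs=200):
    """Return (learning rate, momentum) for the current point in training:
    momentum ramps linearly from mom0 to mom_max over the first ramp_epochs
    epochs; the learning rate decays by a constant factor per batch update.
    All constants here are illustrative placeholders."""
    momentum = min(mom_max, mom0 + (mom_max - mom0) * epoch / ramp_epochs)
    lr = lr0 / (lr_decay ** batch_updates)
    return lr, momentum
```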

Training ended when the momentum had reached its maximum value and the minimum error on a held-out validation set had not decreased appreciably for a set number of epochs. This early stopping prevented overfitting and resulted in each neural network being trained for up to roughly 1,000 epochs. Autoencoder pre-training was performed by training a stack of single-hidden-layer autoencoder networks. Each autoencoder in the stack used tanh hidden units and linear outputs, and was trained with the same initialization scheme, learning algorithm and stopping parameters as in the fine-tuning stage.
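
A greedy stack of such autoencoders can be sketched as follows (our own minimal construction; widths are placeholders):

```python
import torch.nn as nn

def autoencoder(n_in, n_hidden):
    """Single-hidden-layer autoencoder: tanh hidden units, linear outputs."""
    return nn.Sequential(
        nn.Linear(n_in, n_hidden), nn.Tanh(),  # encoder
        nn.Linear(n_hidden, n_in),             # linear decoder
    )

# Greedy layer-wise pre-training: each autoencoder is trained to reconstruct
# the previous encoder's hidden activations; the decoders are then discarded
# and the encoders fine-tuned as the hidden layers of the classifier.
widths = [21, 300, 300, 300]
stack = [autoencoder(widths[i], widths[i + 1]) for i in range(len(widths) - 1)]
encoders = [nn.Sequential(ae[0], ae[1]) for ae in stack]  # keep Linear + Tanh only
```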

When training with dropout, we increased the learning rate decay factor slightly.

All neural networks were trained using the GPU-accelerated Theano and Pylearn2 software libraries (refs 24, 25).

How to cite this article: Baldi, P., Sadowski, P. & Whiteson, D. Searching for exotic particles in high-energy physics with deep learning. Nat. Commun. 5, 4308 (2014).

References

Dawson, S.
Neyman, J. & Pearson, E. S. On the problem of the most efficient tests of statistical hypotheses. Phil. Trans. R. Soc. Lond. A 231, 289–337 (1933).
Hornik, K., Stinchcombe, M. & White, H. Multilayer feedforward networks are universal approximators. Neural Netw. 2, 359–366 (1989).
Hochreiter, S. Recurrent neural net learning and vanishing gradient. Int. J. Uncertain. Fuzziness Knowl. Based Syst. 6, 107–116 (1998).
Bengio, Y., Simard, P. & Frasconi, P. Learning long-term dependencies with gradient descent is difficult. IEEE Trans. Neural Netw. 5, 157–166 (1994).
Hinton, G. E., Srivastava, N., Krizhevsky, A., Sutskever, I. & Salakhutdinov, R. R. Improving neural networks by preventing co-adaptation of feature detectors. Preprint at arXiv:1207.0580 (2012).
Baldi, P. & Sadowski, P. The dropout learning algorithm. Artif. Intell. 210, 78–122 (2014).
Hinton, G. E., Osindero, S. & Teh, Y.-W. A fast learning algorithm for deep belief nets. Neural Comput. 18, 1527–1554 (2006).
Aad, G. et al.
