Computational and statistical methods for analysing big data with applications / [electronic resource]
by Liu, Shen [author.]; McGree, James [author.]; Ge, Zongyuan [author.]; Xie, Yang [author.].
Material type: Book
Publisher: London : Academic Press, 2016.
Description: 1 online resource (viii, 194 pages) : illustrations (some color).
ISBN: 9780081006511; 0081006519.
Subject(s): Big data | Quantitative research | Quantitative research -- Statistical methods | Data mining -- Statistical methods | MATHEMATICS -- Applied | MATHEMATICS -- Probability & Statistics -- General | Electronic books
Online resources: ScienceDirect
Online resource; title from PDF title page (EBSCO, viewed December 3, 2015).
"Academic Press is an imprint of Elsevier."
Includes bibliographical references and index.
Front Cover; Computational and Statistical Methods for Analysing Big Data with Applications; Copyright Page; Contents; List of Figures; List of Tables; Acknowledgment; 1 Introduction; 1.1 What is big data?; 1.1.1 Volume; 1.1.2 Velocity; 1.1.3 Variety; 1.1.4 Another two V's; 1.2 What is this book about?; 1.3 Who is the intended readership?; References; 2 Classification methods; 2.1 Fundamentals of classification; 2.1.1 Features and training samples; Example: Discriminating owners from non-owners of riding mowers; 2.1.2 Probabilities of misclassification and the associated costs.
2.1.3 Classification by minimizing the ECM; Example: Medical diagnosis; 2.1.4 More than two classes; 2.2 Popular classifiers for analysing big data; 2.2.1 k-Nearest neighbour algorithm; 2.2.2 Regression models; 2.2.3 Bayesian networks; 2.2.4 Artificial neural networks; 2.2.5 Decision trees; 2.3 Summary; References; 3 Finding groups in data; 3.1 Principal component analysis; 3.2 Factor analysis; 3.3 Cluster analysis; 3.3.1 Hierarchical clustering procedures; 3.3.2 Nonhierarchical clustering procedures; 3.3.3 Deciding on the number of clusters; 3.4 Fuzzy clustering; Appendix.
R code for principal component analysis and factor analysis; MATLAB code for cluster analysis; References; 4 Computer vision in big data applications; 4.1 Big datasets for computer vision; 4.2 Machine learning in computer vision; 4.2.1 Feature engineering; 4.2.2 Classifiers; Regression; Support vector machine; Gaussian mixture models; 4.3 State-of-the-art methodology: deep learning; 4.3.1 A single-neuron model; 4.3.2 A multilayer neural network; 4.3.3 Training process of multilayer neural networks; Feed-forward pass; Back-propagation pass; 4.4 Convolutional neural networks; 4.4.1 Pooling.
4.4.2 Training a CNN; 4.4.3 An example of CNN in image recognition; Overall structure of the network; Data preprocessing; Prevention of overfitting; 4.5 A tutorial: training a CNN by ImageNet; 4.5.1 Caffe; 4.5.2 Architecture of the network; Input layer; Convolutional layer; Pooling layer; LRN layer; Fully-connected layers; Dropout layers; Softmax layer; 4.5.3 Training; 4.6 Big data challenge: ILSVRC; 4.6.1 Performance evaluation; 4.6.2 Winners in the history of ILSVRC; 4.7 Concluding remarks: a comparison between human brains and computers; Acknowledgements; References.
5 A computational method for analysing large spatial datasets; 5.1 Introduction to spatial statistics; 5.1.1 Spatial dependence; 5.1.2 Cross-variable dependence; 5.1.3 Limitations of conventional approaches to spatial analysis; 5.2 The HOS method; 5.2.1 Cross-variable high-order statistics; 5.2.2 Searching process; 5.2.3 Local CPDF approximation; 5.3 MATLAB functions for the implementation of the HOS method; 5.3.1 Spatial template and searching process; 5.3.2 Higher-order statistics; 5.3.3 Coefficients of Legendre polynomials; 5.3.4 CPDF approximation; 5.4 A case study; References.
Due to the scale and complexity of data sets currently being collected in areas such as health, transportation, environmental science, engineering, information technology, business and finance, modern quantitative analysts are seeking improved and appropriate computational and statistical methods to explore, model and draw inferences from big data. This book aims to introduce suitable approaches for such endeavours, providing applications and case studies for the purpose of demonstration. Computational and Statistical Methods for Analysing Big Data with Applications starts with an overview of the era of big data. It then goes on to explain the computational and statistical methods that have been commonly applied in the big data revolution. For each of these methods, an example is provided as a guide to its application. Five case studies are presented next, focusing respectively on computer vision with massive training data, spatial data analysis, advanced experimental design methods for big data, big data in clinical medicine, and the analysis of data collected from mobile devices. The book concludes with some final thoughts and suggested areas for future research in big data.