000 06313cam a2200733 i 4500
001 ocn889941400
003 OCoLC
005 20171026112149.0
006 m o d
007 cr |||||||||||
008 140902s2014 enk ob 001 0 eng
010 _a 2014035023
015 _aGBB4D4768
_2bnb
016 7 _a016955944
_2Uk
020 _a9781118834787
_qelectronic bk.
020 _a111883478X
_qelectronic bk.
020 _a9781118834800
_qelectronic bk.
020 _a1118834801
_qelectronic bk.
020 _z9781118834817
_qhardback
020 _z9781118834732
020 _z1118834739
029 1 _aNLGGC
_b383512123
029 1 _aAU@
_b000053665594
029 1 _aNZ1
_b15913639
029 1 _aCHVBK
_b334092922
029 1 _aCHBIS
_b010442318
029 1 _aAU@
_b000058372878
029 1 _aDEBBG
_bBV043397096
029 1 _aCHVBK
_b375152725
029 1 _aCHSLU
_b001259237
035 _a(OCoLC)889941400
037 _aAC8C3FD3-C56D-49B4-B991-612E67C28A99
_bOverDrive, Inc.
_nhttp://www.overdrive.com
040 _aDLC
_beng
_erda
_cDLC
_dN$T
_dUKMGB
_dYDXCP
_dDG1
_dCDX
_dRECBK
_dOCLCF
_dOCLCO
_dTEFOD
_dEBLCP
_dOCLCQ
_dDEBBG
042 _apcc
049 _aMAIN
050 0 0 _aQA76.9.D343
072 7 _aCOM
_x000000
_2bisacsh
082 0 0 _a006.3/12
_223
084 _aCOM021030
_2bisacsh
100 1 _aMunzert, Simon.
245 1 0 _aAutomated data collection with R : a practical guide to Web scraping and text mining /
_cSimon Munzert, Christian Rubba, Peter Meißner, Dominic Nyhuis.
_h[electronic resource]
264 1 _aChichester, West Sussex, United Kingdom :
_bWiley,
_c2014.
300 _a1 online resource.
336 _atext
_2rdacontent
337 _acomputer
_2rdamedia
338 _aonline resource
_2rdacarrier
504 _aIncludes bibliographical references and index.
505 8 _aMachine generated contents note: Dedication -- Table of Contents -- List of Figures -- List of Tables -- Preface -- 1 Introduction -- 1.1 Case Study: World Heritage Sites in Danger -- 1.2 Some Remarks on Web Data Quality -- 1.3 Technologies for Disseminating, Extracting and Storing Web Data -- 1.3.1 Technologies for disseminating content on the Web -- 1.4 Structure of the Book -- Part One A Primer on Web and Data Technologies -- 2 HTML -- 2.1 Browser Presentation and Source Code -- 2.2 Syntax Rules -- 2.3 Tags and Attributes -- 2.4 Parsing -- Summary -- Further Reading -- Problems -- 3 XML and JSON -- 3.1 A Short Example XML Document -- 3.2 XML Syntax Rules -- 3.3 When Is an XML Document Well-formed or Valid? -- 3.4 XML Extensions and Technologies -- 3.5 XML and R in Practice -- 3.6 A Short Example JSON Document -- 3.7 JSON Syntax Rules -- 3.8 JSON and R in Practice -- Summary -- Further Reading -- Problems -- 4 XPath -- 4.1 XPath: a Querying Language for Web Documents -- 4.2 Identifying Node Sets with XPath -- 4.3 Extracting Node Elements -- Summary -- Further Reading -- Problems -- 5 HTTP -- 5.1 HTTP Fundamentals -- 5.2 Advanced Features of HTTP -- 5.3 Protocols beyond HTTP -- 5.4 HTTP in Action -- Summary -- Further Reading -- Problems -- 6 AJAX -- 6.1 JavaScript -- 6.2 XHR -- 6.3 Exploring AJAX with Web Developer Tools -- Summary -- Further Reading -- Problems -- 7 SQL and Relational Databases -- 7.1 Overview and Terminology -- 7.2 Relational Databases -- 7.3 SQL: a Language to Communicate with Databases -- 7.4 Databases in Action -- Summary -- Further Reading -- Problems -- 8 Regular Expressions and String Functions -- 8.1 Regular Expressions -- 8.2 String Processing -- 8.3 A Word on Character Encodings -- Summary -- Further Reading -- Problems -- Part Two A Practical Toolbox for Web Scraping and Text Mining -- 9 Scraping the Web -- 9.1 Retrieval Scenarios -- 9.2 Extraction Strategies -- 9.3 Web Scraping: Good Practice -- 9.4 Valuable Sources of Inspiration -- Summary -- Further Reading -- Problems -- 10 Statistical Text Processing -- 10.1 The running example: classifying press releases of the British government -- 10.2 Processing Textual Data -- 10.3 Supervised Learning Techniques -- 10.4 Unsupervised Learning Techniques -- Summary -- Further reading -- 11 Managing Data Projects -- 11.1 Interacting with the File System -- 11.2 Processing Multiple Documents/Links -- 11.3 Organizing Scraping Procedures -- 11.4 Executing R Scripts on a Regular Basis -- Part Three A Bag of Case Studies -- 12 Collaboration Networks in the U.S. Senate -- 12.1 Information on the Bills -- 12.2 Information on the Senators -- 12.3 Analyzing the network structure -- 12.4 Conclusion -- 13 Parsing Information from Semi-Structured Documents -- 13.1 Downloading Data from the FTP Server -- 13.2 Parsing Semi-Structured Text Data -- 13.3 Visualizing station and temperature data -- 14 Predicting the 2014 Academy Awards using Twitter -- 14.1 Twitter APIs: Overview -- 14.2 Twitter-based Forecast of the 2014 Academy Awards -- 14.3 Conclusion -- 15 Mapping the Geographic Distribution of Names -- 15.1 Developing a Data Collection Strategy -- 15.2 Web Site Inspection -- 15.3 Data Retrieval and Information Extraction -- 15.4 Mapping Names -- 15.5 Automating the Process -- 15.6 Summary -- 16 Gathering Data on Mobile Phones -- 16.1 Page Exploration -- 16.2 Scraping Procedure -- 16.3 Graphical Analysis -- 16.4 Data storage -- 17 Analyzing Sentiments of Product Reviews -- 17.1 Introduction -- 17.2 Collecting the data -- 17.3 Analyzing the Data -- 17.4 Conclusion -- References -- Bibliography -- Indices -- General Index -- Package Index -- Function Index.
520 _a"This book provides a unified framework of web scraping and information extraction from text data with R for the social sciences"--
_cProvided by publisher.
588 _aDescription based on print version record and CIP data provided by publisher.
650 0 _aData mining.
650 0 _aAutomatic data collection systems.
650 0 _aSocial sciences
_xResearch
_xData processing.
650 0 _aR (Computer program language)
650 7 _aCOMPUTERS / Database Management / Data Mining.
_2bisacsh
650 7 _aAutomatic data collection systems.
_2fast
_0(OCoLC)fst00822733
650 7 _aData mining.
_2fast
_0(OCoLC)fst00887946
650 7 _aR (Computer program language)
_2fast
_0(OCoLC)fst01086207
650 7 _aSocial sciences
_xResearch
_xData processing.
_2fast
_0(OCoLC)fst01122948
655 4 _aElectronic books.
776 0 8 _iPrint version:
_aMunzert, Simon.
_tAutomated data collection with R
_dHoboken ; Chichester, West Sussex, United Kingdom : John Wiley & Sons Inc., 2014
_z9781118834817
_w(DLC) 2014032266
856 4 0 _uhttp://onlinelibrary.wiley.com/book/10.1002/9781118834732
_zWiley Online Library
942 _2ddc
_cBK
999 _c207650
_d207650