Wow! eBook: Doing Data Science - 5 new eBooks

Doing Data Science
Agile Data Science
Mining the Social Web, 2nd Edition
Python and HDF5
Test-Driven Infrastructure with Chef, 2nd Edition

Posted: 09 Nov 2013 08:27 AM PST

Book Description

Now that people are aware that data can make the difference in an election or a business model, data science as an occupation is gaining ground. But how can you get started working in a wide-ranging, interdisciplinary field that's so clouded in hype? This insightful book, based on Columbia University's Introduction to Data Science class, tells you what you need to know.

In many of these chapter-long lectures, data scientists from companies such as Google, Microsoft, and eBay share new algorithms, methods, and models by presenting case studies and the code they use. If you're familiar with linear algebra, probability, and statistics, and have programming experience, this book is an ideal introduction to data science.

Topics include:

Statistical inference, exploratory data analysis, and the data science process
Algorithms
Spam filters, Naive Bayes, and data wrangling
Logistic regression
Financial modeling
Recommendation engines and causality
Data visualization
Social networks and data journalism
Data engineering, MapReduce, Pregel, and Hadoop

Doing Data Science is a collaboration between course instructor Rachel Schutt, Senior VP of Data Science at News Corp, and data science consultant Cathy O'Neil, a senior data scientist at Johnson Research Labs, who attended and blogged about the course.

Table of Contents
Chapter 1. Introduction: What Is Data Science?
Chapter 2. Statistical Inference, Exploratory Data Analysis, and the Data Science Process
Chapter 3. Algorithms
Chapter 4. Spam Filters, Naive Bayes, and Wrangling
Chapter 5. Logistic Regression
Chapter 6. Time Stamps and Financial Modeling
Chapter 7. Extracting Meaning from Data
Chapter 8. Recommendation Engines: Building a User-Facing Data Product at Scale
Chapter 9. Data Visualization and Fraud Detection
Chapter 10. Social Networks and Data Journalism
Chapter 11. Causality
Chapter 12. Epidemiology
Chapter 13. Lessons Learned from Data Competitions: Data Leakage and Model Evaluation
Chapter 14. Data Engineering: MapReduce, Pregel, and Hadoop
Chapter 15. The Students Speak
Chapter 16. Next-Generation Data Scientists, Hubris, and Ethics

Book Details

Paperback: 406 pages
Publisher: O’Reilly Media (October 2013)
Language: English
ISBN-10: 1449358659
ISBN-13: 978-1449358655

Note: There is a file embedded within this post, please visit this post to download the file.

Related Books

Programming Massively Parallel Processors, 2nd Edition (24-05-2013)
Processing, 2nd Edition (19-06-2013)
Engineering a Compiler, 2nd Edition (09-04-2013)
Think Bayes (02-11-2013)
Machine Learning for Hackers (12-03-2012)
Learning R (31-10-2013)
Heterogeneous Computing with OpenCL, 2nd Edition (22-05-2013)
Hadoop: Beginner's Guide (24-08-2013)
Hadoop Real-World Solutions Cookbook (06-04-2013)
Hadoop in Practice (26-10-2012)
Data Points (24-04-2013)
Data Mining: Concepts and Techniques, 3rd Edition (09-04-2013)
Computer Animation, 3rd Edition (09-04-2013)
Coding Interviews (08-02-2013)
Analyzing the Analyzers (04-08-2013)
Using the TI-83 Plus/TI-84 Plus (18-10-2013)
The Art of Multiprocessor Programming, Revised Reprint (20-05-2013)
The Art of Concurrency (16-10-2009)
Structured Parallel Programming (12-04-2013)
Statistical Analysis with Excel For Dummies, 3rd Edition (23-04-2013)

The post Doing Data Science appeared first on Wow! eBook.

Agile Data Science

Posted: 09 Nov 2013 08:23 AM PST

Book Description

Mining big data requires a deep investment in people and time. How can you be sure you're building the right models? With this hands-on book, you'll learn a flexible toolset and methodology for building effective analytics applications with Hadoop.

Using lightweight tools such as Python, Apache Pig, and the D3.js library, your team will create an agile environment for exploring data, starting with an example application to mine your own email inboxes. You'll learn an iterative approach that enables you to quickly change the kind of analysis you're doing, depending on what the data is telling you. All example code in this book is available as working Heroku apps.

Create analytics applications by using the agile big data development methodology
Build value from your data in a series of agile sprints, using the data-value stack
Gain insight by using several data structures to extract multiple features from a single dataset
Visualize data with charts, and expose different aspects through interactive reports
Use historical data to predict the future, and translate predictions into action
Get feedback from users after each sprint to keep your project on track

Table of Contents
Part I: Setup
Chapter 1. Theory
Chapter 2. Data
Chapter 3. Agile Tools
Chapter 4. To the Cloud!

Part II: Climbing the Pyramid
Chapter 5. Collecting and Displaying Records
Chapter 6. Visualizing Data with Charts
Chapter 7. Exploring Data with Reports
Chapter 8. Making Predictions
Chapter 9. Driving Actions

Book Details

Paperback: 178 pages
Publisher: O’Reilly Media (October 2013)
Language: English
ISBN-10: 1449326269
ISBN-13: 978-1449326265

Related Books

Hadoop Real-World Solutions Cookbook (06-04-2013)
TeamCity 7 Continuous Integration (26-01-2013)
Software Requirements 3, 3rd Edition (14-09-2013)
Programming for PaaS (18-09-2013)
Professional Heroku Programming (06-05-2013)
Managing Data in Motion (27-05-2013)
Making Sense of NoSQL (18-10-2013)
Learning IPython for Interactive Computing and Data Visualization (09-09-2013)
Implementing Splunk (26-02-2013)
From Techie to Boss (18-06-2013)
Collaborative Enterprise Architecture (22-05-2013)
Cloud Computing: Theory and Practice (05-09-2013)
Big Data For Dummies (23-04-2013)
Apache Solr 4 Cookbook (02-03-2013)
WordPress All-in-One For Dummies, 2nd Edition (06-07-2013)
What's New in SQL Server 2012 (12-11-2012)
Web Services, Service-Oriented Architectures, and Cloud Computing, 2nd Edition (24-05-2013)
Visual Studio Lightswitch 2012 (12-10-2013)
Visual Intelligence (23-07-2013)
Think Bayes (02-11-2013)

Mining the Social Web, 2nd Edition

Posted: 09 Nov 2013 08:19 AM PST

Book Description

How can you tap into the wealth of social web data to discover who's making connections with whom, what they're talking about, and where they're located? With this expanded and thoroughly revised edition, you'll learn how to acquire, analyze, and summarize data from all corners of the social web, including Facebook, Twitter, LinkedIn, Google+, GitHub, email, websites, and blogs.

Employ the Natural Language Toolkit, NetworkX, and other scientific computing tools to mine popular social web sites
Apply advanced text-mining techniques, such as clustering and TF-IDF, to extract meaning from human language data
Bootstrap interest graphs from GitHub by discovering affinities among people, programming languages, and coding projects
Build interactive visualizations with D3.js, an extraordinarily flexible HTML5 and JavaScript toolkit
Take advantage of more than two dozen Twitter recipes, presented in O'Reilly's popular “problem/solution/discussion” cookbook format
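
For readers unfamiliar with TF-IDF, mentioned in the list above, here is a minimal sketch of the idea. It is not taken from the book: the function name `tf_idf`, the sample `docs`, whitespace tokenization, and the plain log(N/df) weighting are all assumptions made for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per document (hypothetical helper).

    tf  = term count / number of tokens in the document
    idf = log(number of documents / number of documents containing the term)
    """
    # Assumption: lowercase + whitespace split is enough tokenization here.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


# Made-up sample documents, for illustration only.
docs = [
    "mining the social web",
    "mining twitter data",
    "the social graph",
]
scores = tf_idf(docs)

# Terms that occur in fewer documents score higher within a document.
top_term = max(scores[1], key=scores[1].get)
```

In this sample, "twitter" appears in only one document, so it outscores "mining", which appears in two. Production libraries usually add smoothing and normalization on top of this basic weighting.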

The example code for this unique data science book is maintained in a public GitHub repository. It's designed to be easily accessible through a turnkey virtual machine that facilitates interactive learning with an easy-to-use collection of IPython Notebooks.

Table of Contents
Part I: A Guided Tour of the Social Web
Chapter 1. Mining Twitter: Exploring Trending Topics, Discovering What People Are Talking About, and More
Chapter 2. Mining Facebook: Analyzing Fan Pages, Examining Friendships, and More
Chapter 3. Mining LinkedIn: Faceting Job Titles, Clustering Colleagues, and More
Chapter 4. Mining Google+: Computing Document Similarity, Extracting Collocations, and More
Chapter 5. Mining Web Pages: Using Natural Language Processing to Understand Human Language, Summarize Blog Posts, and More
Chapter 6. Mining Mailboxes: Analyzing Who's Talking to Whom About What, How Often, and More
Chapter 7. Mining GitHub: Inspecting Software Collaboration Habits, Building Interest Graphs, and More
Chapter 8. Mining the Semantically Marked-Up Web: Extracting Microformats, Inferencing over RDF, and More

Part II: Twitter Cookbook
Chapter 9. Twitter Cookbook

Part III: Appendixes
Appendix A. Information About This Book's Virtual Machine Experience
Appendix B. OAuth Primer
Appendix C. Python and IPython Notebook Tips & Tricks

Book Details

Paperback: 448 pages
Publisher: O’Reilly Media; 2nd Edition (October 2013)
Language: English
ISBN-10: 1449367615
ISBN-13: 978-1449367619

Related Books

Mining the Social Web (20-01-2011)
Social Media Engagement For Dummies (06-07-2013)
Instant Spring for Android Starter (28-02-2013)
Social Networking Spaces (25-03-2010)
iOS 6 Recipes (01-02-2013)
Get Up to Speed with Online Marketing (31-07-2012)
Developer's Guide to Social Programming (07-10-2010)
Branding Yourself, 2nd Edition (02-10-2012)
Advanced Social Media Marketing (05-02-2013)
Xcode 4 Cookbook (25-09-2013)
Think Before You Engage (08-06-2012)
The New Community Rules (24-07-2009)
Ruby on Rails Tutorial, 2nd Edition (29-09-2012)
Programming the Mobile Web, 2nd Edition (19-04-2013)
Opa: Up and Running (09-03-2013)
Opa Application Development (28-09-2013)
NOOK HD: The Missing Manual, 2nd Edition (07-03-2013)
Magento Mobile How-to (22-02-2013)
iPod and iTunes For Dummies, 10th Edition (06-03-2013)
Instant PhoneGap Social App Development (28-02-2013)

Python and HDF5

Posted: 09 Nov 2013 08:14 AM PST

Book Description

Gain hands-on experience with HDF5 for storing scientific data in Python. This practical guide quickly gets you up to speed on the details, best practices, and pitfalls of using HDF5 to archive and share numerical datasets ranging in size from gigabytes to terabytes.

Through real-world examples and practical exercises, you'll explore topics such as scientific datasets, hierarchically organized groups, user-defined metadata, and interoperable files. Examples are applicable for users of both Python 2 and Python 3. If you're familiar with the basics of Python data analysis, this is an ideal introduction to HDF5.

Get set up with HDF5 tools and create your first HDF5 file
Work with datasets by learning the HDF5 Dataset object
Understand advanced features like dataset chunking and compression
Learn how to work with HDF5's hierarchical structure, using groups
Create self-describing files by adding metadata with HDF5 attributes
Take advantage of HDF5's type system to create interoperable files
Express relationships among data with references, named types, and dimension scales
Discover how Python mechanisms for writing parallel code interact with HDF5

Table of Contents
Chapter 1. Introduction
Chapter 2. Getting Started
Chapter 3. Working with Datasets
Chapter 4. How Chunking and Compression Can Help You
Chapter 5. Groups, Links, and Iteration: The "H" in HDF5
Chapter 6. Storing Metadata with Attributes
Chapter 7. More About Types
Chapter 8. Organizing Data with References, Types, and Dimension Scales
Chapter 9. Concurrency: Parallel HDF5, Threading, and Multiprocessing
Chapter 10. Next Steps

Book Details

Paperback: 152 pages
Publisher: O’Reilly Media (October 2013)
Language: English
ISBN-10: 1449367836
ISBN-13: 978-1449367831

Related Books

Learning IPython for Interactive Computing and Data Visualization (09-09-2013)
Think Bayes (02-11-2013)
Python Cookbook, 3rd Edition (09-06-2013)
Programming Massively Parallel Processors, 2nd Edition (24-05-2013)
OpenCV Computer Vision with Python (09-09-2013)
NumPy: Beginner's Guide, 2nd Edition (03-09-2013)
Learning R (31-10-2013)
Introducing Geographic Information Systems with ArcGIS, 3rd Edition (23-07-2013)
Instant InnoDB (05-04-2013)
Heterogeneous Computing with OpenCL, 2nd Edition (22-05-2013)
Excel 2013: The Missing Manual (15-05-2013)
Agile Data Science (09-11-2013)
ZeroMQ (17-04-2013)
Windows Server 2012 Inside Out (11-02-2013)
Windows Server 2012 Automation with PowerShell Cookbook (28-08-2013)
Windows Forensic Analysis Toolkit, 3rd Edition (10-05-2013)
Web Audio API (15-04-2013)
Visual Intelligence (23-07-2013)
Virtualization For Dummies (03-07-2013)
Twisted Network Programming Essentials, 2nd Edition (17-04-2013)

Test-Driven Infrastructure with Chef, 2nd Edition

Posted: 09 Nov 2013 08:08 AM PST

Book Description

Since Test-Driven Infrastructure with Chef first appeared in mid-2011, infrastructure testing has begun to flourish in the web ops world. In this revised and expanded edition, author Stephen Nelson-Smith brings you up to date on this rapidly evolving discipline, including the philosophy driving it and a growing array of tools. You'll get a hands-on introduction to the Chef framework, and a recommended toolchain and workflow for developing your own test-driven production infrastructure.

Several exercises and examples throughout the book help you gain experience with Chef and the entire infrastructure-testing ecosystem. Learn how this test-first approach provides increased security, code quality, and peace of mind.

Explore the underpinning philosophy that infrastructure can and should be treated as code
Become familiar with the MASCOT approach to test-driven infrastructure
Understand the basics of test-driven and behavior-driven development for managing change
Dive into Chef fundamentals by building an infrastructure with real examples
Discover how Chef works with tools such as VirtualBox and Vagrant
Get a deeper understanding of Chef by learning Ruby language basics
Learn the tools and workflow necessary to conduct unit, integration, and acceptance tests

Table of Contents
Chapter 1. The Philosophy of Test-Driven Infrastructure
Chapter 2. An Introduction to Ruby
Chapter 3. An Introduction to Chef
Chapter 4. Using Chef with Tools
Chapter 5. An Introduction to Test- and Behavior-Driven Development
Chapter 6. A Test-Driven Infrastructure Framework
Chapter 7. Test-Driven Infrastructure: A Recommended Toolchain
Chapter 8. Epilogue

Book Details

Paperback: 308 pages
Publisher: O’Reilly Media; 2nd Edition (October 2013)
Language: English
ISBN-10: 1449372201
ISBN-13: 978-1449372200

Related Books

The Definitive Guide to Grails 2 (08-02-2013)
RubyMotion (16-02-2013)
JavaScript Testing with Jasmine (19-04-2013)
Beginning Rails 4, 3rd Edition (15-10-2013)
Vagrant: Up and Running (10-06-2013)
Vaadin 7 Cookbook (11-09-2013)
UX for Lean Startups (08-06-2013)
Third-Party JavaScript (04-04-2013)
The User Experience Team of One (02-08-2013)
Testable JavaScript (12-02-2013)
Spring in Practice (16-07-2013)
Single Page Web Applications (19-10-2013)
Sencha Touch 2: Up and Running (09-03-2013)
Secrets of the JavaScript Ninja (04-04-2013)
Scala in Action (16-07-2013)
Programming Ruby 1.9 & 2.0, 4th Edition (22-08-2013)
Programming Massively Parallel Processors, 2nd Edition (24-05-2013)
Professional Heroku Programming (06-05-2013)
Pro Team Foundation Service (21-06-2013)
Pro ASP.NET MVC 4, 4th Edition (08-02-2013)

Programming Ebook Guide

Saturday, 09 November 2013