Saturday, 09 November 2013

Wow! eBook: Doing Data Science - 5 new eBooks



Doing Data Science

Posted: 09 Nov 2013 08:27 AM PST

Book Description

Now that people are aware that data can make the difference in an election or a business model, data science as an occupation is gaining ground. But how can you get started working in a wide-ranging, interdisciplinary field that's so clouded in hype? This insightful book, based on Columbia University's Introduction to Data Science class, tells you what you need to know.

In many of these chapter-long lectures, data scientists from companies such as Google, Microsoft, and eBay share new algorithms, methods, and models by presenting case studies and the code they use. If you're familiar with linear algebra, probability, and statistics, and have programming experience, this book is an ideal introduction to data science.

Topics include:

  • Statistical inference, exploratory data analysis, and the data science process
  • Algorithms
  • Spam filters, Naive Bayes, and data wrangling
  • Logistic regression
  • Financial modeling
  • Recommendation engines and causality
  • Data visualization
  • Social networks and data journalism
  • Data engineering, MapReduce, Pregel, and Hadoop

Doing Data Science is a collaboration between course instructor Rachel Schutt, Senior VP of Data Science at News Corp, and data science consultant Cathy O'Neil, a senior data scientist at Johnson Research Labs, who attended and blogged about the course.

Table of Contents
Chapter 1. Introduction: What Is Data Science?
Chapter 2. Statistical Inference, Exploratory Data Analysis, and the Data Science Process
Chapter 3. Algorithms
Chapter 4. Spam Filters, Naive Bayes, and Wrangling
Chapter 5. Logistic Regression
Chapter 6. Time Stamps and Financial Modeling
Chapter 7. Extracting Meaning from Data
Chapter 8. Recommendation Engines: Building a User-Facing Data Product at Scale
Chapter 9. Data Visualization and Fraud Detection
Chapter 10. Social Networks and Data Journalism
Chapter 11. Causality
Chapter 12. Epidemiology
Chapter 13. Lessons Learned from Data Competitions: Data Leakage and Model Evaluation
Chapter 14. Data Engineering: MapReduce, Pregel, and Hadoop
Chapter 15. The Students Speak
Chapter 16. Next-Generation Data Scientists, Hubris, and Ethics

Book Details

  • Paperback: 406 pages
  • Publisher: O’Reilly Media (October 2013)
  • Language: English
  • ISBN-10: 1449358659
  • ISBN-13: 978-1449358655

The post Doing Data Science appeared first on Wow! eBook.

Agile Data Science

Posted: 09 Nov 2013 08:23 AM PST

Book Description

Mining big data requires a deep investment in people and time. How can you be sure you're building the right models? With this hands-on book, you'll learn a flexible toolset and methodology for building effective analytics applications with Hadoop.

Using lightweight tools such as Python, Apache Pig, and the D3.js library, your team will create an agile environment for exploring data, starting with an example application to mine your own email inboxes. You'll learn an iterative approach that enables you to quickly change the kind of analysis you're doing, depending on what the data is telling you. All example code in this book is available as working Heroku apps.

  • Create analytics applications by using the agile big data development methodology
  • Build value from your data in a series of agile sprints, using the data-value stack
  • Gain insight by using several data structures to extract multiple features from a single dataset
  • Visualize data with charts, and expose different aspects through interactive reports
  • Use historical data to predict the future, and translate predictions into action
  • Get feedback from users after each sprint to keep your project on track

Table of Contents
Part I: Setup
Chapter 1. Theory
Chapter 2. Data
Chapter 3. Agile Tools
Chapter 4. To the Cloud!

Part II: Climbing the Pyramid
Chapter 5. Collecting and Displaying Records
Chapter 6. Visualizing Data with Charts
Chapter 7. Exploring Data with Reports
Chapter 8. Making Predictions
Chapter 9. Driving Actions

Book Details

  • Paperback: 178 pages
  • Publisher: O’Reilly Media (October 2013)
  • Language: English
  • ISBN-10: 1449326269
  • ISBN-13: 978-1449326265

The post Agile Data Science appeared first on Wow! eBook.

Mining the Social Web, 2nd Edition

Posted: 09 Nov 2013 08:19 AM PST

Book Description

How can you tap into the wealth of social web data to discover who's making connections with whom, what they're talking about, and where they're located? With this expanded and thoroughly revised edition, you'll learn how to acquire, analyze, and summarize data from all corners of the social web, including Facebook, Twitter, LinkedIn, Google+, GitHub, email, websites, and blogs.

  • Employ the Natural Language Toolkit, NetworkX, and other scientific computing tools to mine popular social web sites
  • Apply advanced text-mining techniques, such as clustering and TF-IDF, to extract meaning from human language data
  • Bootstrap interest graphs from GitHub by discovering affinities among people, programming languages, and coding projects
  • Build interactive visualizations with D3.js, an extraordinarily flexible HTML5 and JavaScript toolkit
  • Take advantage of more than two dozen Twitter recipes, presented in O'Reilly's popular “problem/solution/discussion” cookbook format

The example code for this unique data science book is maintained in a public GitHub repository. It's designed to be easily accessible through a turnkey virtual machine that facilitates interactive learning with an easy-to-use collection of IPython Notebooks.

Table of Contents
Part I: A Guided Tour of the Social Web
Chapter 1. Mining Twitter: Exploring Trending Topics, Discovering What People Are Talking About, and More
Chapter 2. Mining Facebook: Analyzing Fan Pages, Examining Friendships, and More
Chapter 3. Mining LinkedIn: Faceting Job Titles, Clustering Colleagues, and More
Chapter 4. Mining Google+: Computing Document Similarity, Extracting Collocations, and More
Chapter 5. Mining Web Pages: Using Natural Language Processing to Understand Human Language, Summarize Blog Posts, and More
Chapter 6. Mining Mailboxes: Analyzing Who's Talking to Whom About What, How Often, and More
Chapter 7. Mining GitHub: Inspecting Software Collaboration Habits, Building Interest Graphs, and More
Chapter 8. Mining the Semantically Marked-Up Web: Extracting Microformats, Inferencing over RDF, and More

Part II: Twitter Cookbook
Chapter 9. Twitter Cookbook

Part III: Appendixes
Appendix A. Information About This Book's Virtual Machine Experience
Appendix B. OAuth Primer
Appendix C. Python and IPython Notebook Tips & Tricks

Book Details

  • Paperback: 448 pages
  • Publisher: O’Reilly Media; 2nd Edition (October 2013)
  • Language: English
  • ISBN-10: 1449367615
  • ISBN-13: 978-1449367619

The post Mining the Social Web, 2nd Edition appeared first on Wow! eBook.

Python and HDF5

Posted: 09 Nov 2013 08:14 AM PST

Book Description

Gain hands-on experience with HDF5 for storing scientific data in Python. This practical guide quickly gets you up to speed on the details, best practices, and pitfalls of using HDF5 to archive and share numerical datasets ranging in size from gigabytes to terabytes.

Through real-world examples and practical exercises, you'll explore topics such as scientific datasets, hierarchically organized groups, user-defined metadata, and interoperable files. Examples are applicable for users of both Python 2 and Python 3. If you're familiar with the basics of Python data analysis, this is an ideal introduction to HDF5.

  • Get set up with HDF5 tools and create your first HDF5 file
  • Work with datasets by learning the HDF5 Dataset object
  • Understand advanced features like dataset chunking and compression
  • Learn how to work with HDF5's hierarchical structure, using groups
  • Create self-describing files by adding metadata with HDF5 attributes
  • Take advantage of HDF5's type system to create interoperable files
  • Express relationships among data with references, named types, and dimension scales
  • Discover how Python mechanisms for writing parallel code interact with HDF5

Table of Contents
Chapter 1. Introduction
Chapter 2. Getting Started
Chapter 3. Working with Datasets
Chapter 4. How Chunking and Compression Can Help You
Chapter 5. Groups, Links, and Iteration: The "H" in HDF5
Chapter 6. Storing Metadata with Attributes
Chapter 7. More About Types
Chapter 8. Organizing Data with References, Types, and Dimension Scales
Chapter 9. Concurrency: Parallel HDF5, Threading, and Multiprocessing
Chapter 10. Next Steps

Book Details

  • Paperback: 152 pages
  • Publisher: O’Reilly Media (October 2013)
  • Language: English
  • ISBN-10: 1449367836
  • ISBN-13: 978-1449367831

The post Python and HDF5 appeared first on Wow! eBook.

Test-Driven Infrastructure with Chef, 2nd Edition

Posted: 09 Nov 2013 08:08 AM PST

Book Description

Since Test-Driven Infrastructure with Chef first appeared in mid-2011, infrastructure testing has begun to flourish in the web ops world. In this revised and expanded edition, author Stephen Nelson-Smith brings you up to date on this rapidly evolving discipline, including the philosophy driving it and a growing array of tools. You'll get a hands-on introduction to the Chef framework, and a recommended toolchain and workflow for developing your own test-driven production infrastructure.

Several exercises and examples throughout the book help you gain experience with Chef and the entire infrastructure-testing ecosystem. Learn how this test-first approach provides increased security, code quality, and peace of mind.

  • Explore the underpinning philosophy that infrastructure can and should be treated as code
  • Become familiar with the MASCOT approach to test-driven infrastructure
  • Understand the basics of test-driven and behavior-driven development for managing change
  • Dive into Chef fundamentals by building an infrastructure with real examples
  • Discover how Chef works with tools such as VirtualBox and Vagrant
  • Get a deeper understanding of Chef by learning Ruby language basics
  • Learn the tools and workflow necessary to conduct unit, integration, and acceptance tests

Table of Contents
Chapter 1. The Philosophy of Test-Driven Infrastructure
Chapter 2. An Introduction to Ruby
Chapter 3. An Introduction to Chef
Chapter 4. Using Chef with Tools
Chapter 5. An Introduction to Test- and Behavior-Driven Development
Chapter 6. A Test-Driven Infrastructure Framework
Chapter 7. Test-Driven Infrastructure: A Recommended Toolchain
Chapter 8. Epilogue

Book Details

  • Paperback: 308 pages
  • Publisher: O’Reilly Media; 2nd Edition (October 2013)
  • Language: English
  • ISBN-10: 1449372201
  • ISBN-13: 978-1449372200

The post Test-Driven Infrastructure with Chef, 2nd Edition appeared first on Wow! eBook.
