Doing Data Science
Posted: 09 Nov 2013 08:27 AM PST

Book Description
Now that people are aware that data can make the difference in an election or a business model, data science as an occupation is gaining ground. But how can you get started working in a wide-ranging, interdisciplinary field that's so clouded in hype? This insightful book, based on Columbia University's Introduction to Data Science class, tells you what you need to know.

In many of these chapter-long lectures, data scientists from companies such as Google, Microsoft, and eBay share new algorithms, methods, and models by presenting case studies and the code they use. If you're familiar with linear algebra, probability, and statistics, and have programming experience, this book is an ideal introduction to data science.

Topics include:
- Statistical inference, exploratory data analysis, and the data science process
- Algorithms
- Spam filters, Naive Bayes, and data wrangling
- Logistic regression
- Financial modeling
- Recommendation engines and causality
- Data visualization
- Social networks and data journalism
- Data engineering, MapReduce, Pregel, and Hadoop

Doing Data Science is a collaboration between course instructor Rachel Schutt, Senior VP of Data Science at News Corp, and data science consultant Cathy O'Neil, a senior data scientist at Johnson Research Labs, who attended and blogged about the course.

Table of Contents
Chapter 1. Introduction: What Is Data Science?
Chapter 2. Statistical Inference, Exploratory Data Analysis, and the Data Science Process
Chapter 3. Algorithms
Chapter 4. Spam Filters, Naive Bayes, and Wrangling
Chapter 5. Logistic Regression
Chapter 6. Time Stamps and Financial Modeling
Chapter 7. Extracting Meaning from Data
Chapter 8. Recommendation Engines: Building a User-Facing Data Product at Scale
Chapter 9. Data Visualization and Fraud Detection
Chapter 10. Social Networks and Data Journalism
Chapter 11. Causality
Chapter 12. Epidemiology
Chapter 13. Lessons Learned from Data Competitions: Data Leakage and Model Evaluation
Chapter 14. Data Engineering: MapReduce, Pregel, and Hadoop
Chapter 15. The Students Speak
Chapter 16. Next-Generation Data Scientists, Hubris, and Ethics

Book Details
- Paperback: 406 pages
- Publisher: O’Reilly Media (October 2013)
- Language: English
- ISBN-10: 1449358659
- ISBN-13: 978-1449358655
The post Doing Data Science appeared first on Wow! eBook.

Agile Data Science
Posted: 09 Nov 2013 08:23 AM PST

Book Description
Mining big data requires a deep investment in people and time. How can you be sure you're building the right models? With this hands-on book, you'll learn a flexible toolset and methodology for building effective analytics applications with Hadoop.

Using lightweight tools such as Python, Apache Pig, and the D3.js library, your team will create an agile environment for exploring data, starting with an example application to mine your own email inboxes. You'll learn an iterative approach that enables you to quickly change the kind of analysis you're doing, depending on what the data is telling you. All example code in this book is available as working Heroku apps.

- Create analytics applications by using the agile big data development methodology
- Build value from your data in a series of agile sprints, using the data-value stack
- Gain insight by using several data structures to extract multiple features from a single dataset
- Visualize data with charts, and expose different aspects through interactive reports
- Use historical data to predict the future, and translate predictions into action
- Get feedback from users after each sprint to keep your project on track

Table of Contents
Part I: Setup
Chapter 1. Theory
Chapter 2. Data
Chapter 3. Agile Tools
Chapter 4. To the Cloud!
Part II: Climbing the Pyramid
Chapter 5. Collecting and Displaying Records
Chapter 6. Visualizing Data with Charts
Chapter 7. Exploring Data with Reports
Chapter 8. Making Predictions
Chapter 9. Driving Actions

Book Details
- Paperback: 178 pages
- Publisher: O’Reilly Media (October 2013)
- Language: English
- ISBN-10: 1449326269
- ISBN-13: 978-1449326265

Mining the Social Web, 2nd Edition
Posted: 09 Nov 2013 08:19 AM PST

Book Description
How can you tap into the wealth of social web data to discover who's making connections with whom, what they're talking about, and where they're located? With this expanded and thoroughly revised edition, you'll learn how to acquire, analyze, and summarize data from all corners of the social web, including Facebook, Twitter, LinkedIn, Google+, GitHub, email, websites, and blogs.

- Employ the Natural Language Toolkit, NetworkX, and other scientific computing tools to mine popular social web sites
- Apply advanced text-mining techniques, such as clustering and TF-IDF, to extract meaning from human language data
- Bootstrap interest graphs from GitHub by discovering affinities among people, programming languages, and coding projects
- Build interactive visualizations with D3.js, an extraordinarily flexible HTML5 and JavaScript toolkit
- Take advantage of more than two dozen Twitter recipes, presented in O'Reilly's popular “problem/solution/discussion” cookbook format

The example code for this unique data science book is maintained in a public GitHub repository. It's designed to be easily accessible through a turnkey virtual machine that facilitates interactive learning with an easy-to-use collection of IPython Notebooks.

Table of Contents
Part I: A Guided Tour of the Social Web
Chapter 1. Mining Twitter: Exploring Trending Topics, Discovering What People Are Talking About, and More
Chapter 2. Mining Facebook: Analyzing Fan Pages, Examining Friendships, and More
Chapter 3. Mining LinkedIn: Faceting Job Titles, Clustering Colleagues, and More
Chapter 4. Mining Google+: Computing Document Similarity, Extracting Collocations, and More
Chapter 5. Mining Web Pages: Using Natural Language Processing to Understand Human Language, Summarize Blog Posts, and More
Chapter 6. Mining Mailboxes: Analyzing Who's Talking to Whom About What, How Often, and More
Chapter 7. Mining GitHub: Inspecting Software Collaboration Habits, Building Interest Graphs, and More
Chapter 8. Mining the Semantically Marked-Up Web: Extracting Microformats, Inferencing over RDF, and More
Part II: Twitter Cookbook
Chapter 9. Twitter Cookbook
Part III: Appendixes
Appendix A. Information About This Book's Virtual Machine Experience
Appendix B. OAuth Primer
Appendix C. Python and IPython Notebook Tips & Tricks

Book Details
- Paperback: 448 pages
- Publisher: O’Reilly Media; 2nd Edition (October 2013)
- Language: English
- ISBN-10: 1449367615
- ISBN-13: 978-1449367619

Python and HDF5
Posted: 09 Nov 2013 08:14 AM PST

Book Description
Gain hands-on experience with HDF5 for storing scientific data in Python. This practical guide quickly gets you up to speed on the details, best practices, and pitfalls of using HDF5 to archive and share numerical datasets ranging in size from gigabytes to terabytes.

Through real-world examples and practical exercises, you'll explore topics such as scientific datasets, hierarchically organized groups, user-defined metadata, and interoperable files. Examples are applicable for users of both Python 2 and Python 3. If you're familiar with the basics of Python data analysis, this is an ideal introduction to HDF5.

- Get set up with HDF5 tools and create your first HDF5 file
- Work with datasets by learning the HDF5 Dataset object
- Understand advanced features like dataset chunking and compression
- Learn how to work with HDF5's hierarchical structure, using groups
- Create self-describing files by adding metadata with HDF5 attributes
- Take advantage of HDF5's type system to create interoperable files
- Express relationships among data with references, named types, and dimension scales
- Discover how Python mechanisms for writing parallel code interact with HDF5

Table of Contents
Chapter 1. Introduction
Chapter 2. Getting Started
Chapter 3. Working with Datasets
Chapter 4. How Chunking and Compression Can Help You
Chapter 5. Groups, Links, and Iteration: The "H" in HDF5
Chapter 6. Storing Metadata with Attributes
Chapter 7. More About Types
Chapter 8. Organizing Data with References, Types, and Dimension Scales
Chapter 9. Concurrency: Parallel HDF5, Threading, and Multiprocessing
Chapter 10. Next Steps

Book Details
- Paperback: 152 pages
- Publisher: O’Reilly Media (October 2013)
- Language: English
- ISBN-10: 1449367836
- ISBN-13: 978-1449367831

Test-Driven Infrastructure with Chef, 2nd Edition
Posted: 09 Nov 2013 08:08 AM PST

Book Description
Since Test-Driven Infrastructure with Chef first appeared in mid-2011, infrastructure testing has begun to flourish in the web ops world. In this revised and expanded edition, author Stephen Nelson-Smith brings you up to date on this rapidly evolving discipline, including the philosophy driving it and a growing array of tools.

You'll get a hands-on introduction to the Chef framework, and a recommended toolchain and workflow for developing your own test-driven production infrastructure. Several exercises and examples throughout the book help you gain experience with Chef and the entire infrastructure-testing ecosystem. Learn how this test-first approach provides increased security, code quality, and peace of mind.

- Explore the underpinning philosophy that infrastructure can and should be treated as code
- Become familiar with the MASCOT approach to test-driven infrastructure
- Understand the basics of test-driven and behavior-driven development for managing change
- Dive into Chef fundamentals by building an infrastructure with real examples
- Discover how Chef works with tools such as VirtualBox and Vagrant
- Get a deeper understanding of Chef by learning Ruby language basics
- Learn the tools and workflow necessary to conduct unit, integration, and acceptance tests

Table of Contents
Chapter 1. The Philosophy of Test-Driven Infrastructure
Chapter 2. An Introduction to Ruby
Chapter 3. An Introduction to Chef
Chapter 4. Using Chef with Tools
Chapter 5. An Introduction to Test- and Behavior-Driven Development
Chapter 6. A Test-Driven Infrastructure Framework
Chapter 7. Test-Driven Infrastructure: A Recommended Toolchain
Chapter 8. Epilogue

Book Details
- Paperback: 308 pages
- Publisher: O’Reilly Media; 2nd Edition (October 2013)
- Language: English
- ISBN-10: 1449372201
- ISBN-13: 978-1449372200