Business Data Science by Matt Taddy (Kindle, Epub eBook)

Publisher's Note: Products purchased from third-party sellers are not guaranteed by the publisher for quality, authenticity, or access to any online entitlements included with the product.

Use machine learning to understand your customers, frame decisions, and drive value.

The business analytics world has changed, and data scientists are taking over. Business Data Science takes you through the steps of using machine learning to implement best-in-class business data science. Whether you are a business leader with a desire to go deep on data or an engineer who wants to learn how to apply machine learning to business problems, you'll find the information, insight, and tools you need to flourish in today's data-driven economy. You'll learn how to:

- Use the key building blocks of machine learning: sparse regularization, out-of-sample validation, and latent factor and topic modeling
- Understand how to use ML tools in real-world business problems, where causation matters more than correlation
- Solve data science problems by scripting in the R programming language

Today's business landscape is driven by data and constantly shifting. Companies live and die on their ability to make and implement the right decisions quickly and effectively. Business Data Science is about doing data science right. It's about the exciting things being done around Big Data to run a flourishing business. It's about the precepts, principles, and best practices that you need to know for best-in-class business data science.


10 thoughts on “Business Data Science”

  1. says:

    Matt Taddy's Business Data Science should be required reading for anyone doing applied econometrics in 2019. While the title makes it sound like any one of dozens of books on introductory statistics for business, rehashing 30- or even 100-year-old material for bored business school students in a mandatory class, this is actually a tastefully arranged tour of, and hands-on introduction to, the tools and practices that a current-day data scientist at a top-tier company will use in solving real business problems, written by an economist with practical experience at eBay and Microsoft as well as an academic record spanning top-tier publications in both economics and ML. Concretely, that means that while it eases you in with basic regression, it proceeds quickly to modern statistical learning tools, starting with the bootstrap, moving to regularized regression and classification, and reaching to tree-based methods and unsupervised learning, all accompanied by inline code in R and abundant practical advice from years of real-world experience. The selection of topics is opinionated rather than encyclopedic: methods that are found useful, including particular variants of the lasso, get detailed code examples, often coming from Taddy's own research, while methods which are now mostly passé, like SVMs, get a dismissive sentence or two, which is appropriate for an introduction. The general subject matter and style are comparable to books like Hastie, Tibshirani, and Friedman's Elements of Statistical Learning and Efron and Hastie's Computer Age Statistical Inference, but the focus, applications, and presumed reader knowledge are targeted to business and economics in a way which will make it a better introduction for that class of readers.

    One of the real standout features of this book is the real data examples, for which the real code introduces you, through learning by doing, to the kinds of data cleaning, formatting, and computational implementation details that make up so much of the work life of the professional data scientist but are nowhere to be seen in your typical intro class, where students work with mtcars, iris, UCI repository data sets, MNIST, etc. These details include the very nice coverage of sparse matrix and parallel and distributed computing facilities in R, along with candid admissions of when Python would be preferred instead. There are some of these canned data sets in the book for simple examples, but enough real ones that you get the feel for it. Having all of this inline, rather than hidden away or in some special data cleaning chapter, makes for somewhat challenging reading for the reader not well versed in R, and I personally kept an R terminal open while reading to try out bits needing clarification. But presenting it as it came up, for the specific issues of each specific data set, kept it manageable while also demonstrating what it requires in practice.

    The "Business" in the title is not just because the author taught in a business school: beyond the business and economic examples, careful thought about when and how the methods should be used for business goals is integrated throughout. Causal inference gets two chapters, covering experiments and control, which go over A/B testing and a quick run through the Mostly Harmless playbook that makes up most of the distinctive econometric knowledge of many applied economists (IV, RD, diff-in-diff, you know the drill), but also recent methods incorporating machine learning components, like variants of orthogonal/doubly robust estimators and causal trees and forests for high-dimensional control and heterogeneous treatment effects estimation. The discussion of demand estimation here is particularly well informed.

    Much of Taddy's own work has been on the use of text data, and the discussion is highly practical, focusing on preprocessing tasks and on issues with the use and interpretation of simple methods like lasso and partial least squares text regression on the supervised side and PCA and LDA on the unsupervised side. This, like the book's heavy focus on lasso methods, reflects a taste for simple, scalable, and interpretable methods with stable, well-established implementations, which is appropriate both for an introductory text and for real business decisions, which require input that takes both data and real thinking rather than just ML black boxes. Deep learning is postponed to the last chapter, which is more descriptive than instructional. This may be disappointing for people swept up in the hype, and for those unfortunate engineers at companies that mandate all data analysis be performed remotely via a Tensorflow API, but it is defensible for the intended business and economics audience. I personally introduce a little bit of Keras when teaching similar material, just to demonstrate that deep learning is no more challenging than any of the rest of the material, but given how fast the frameworks change and the abundance of alternative resources, it's fine to leave it out.

    In terms of complaints, aside from not having released the book earlier, when I was teaching classes on similar material and had to build up notes from scattered papers and software guides, I don't have many. Some of the figures and math notation get jumbled up or cut off in the Kindle edition, which is a minor annoyance. The included subsection on inference for the lasso, while appropriately heavily caveated, gives readers by its very presence the pernicious idea that standard errors for lasso coefficients make any sense at all, rather than being an elementary misunderstanding that overzealous and underinformed referees should be persuaded out of. Specifically, in high dimensions, where a bias-variance tradeoff is unavoidable, valid confidence intervals for a regularized estimator cannot be made to be centered on the estimates, at least not without being massively inflated. One can produce intervals for the underlying model or for particular functionals of the model, but jointly achieving rate-optimal predictions and rate-optimal valid frequentist inference for a model of this type is known to be simply impossible; Giné and Nickl's book is a useful source on this. The proposed purely heuristic approaches based on undersmoothing and subsampling seem unnecessary given that the later causal chapter covers a semiparametric approach for estimation of functionals which is demonstrably fit for purpose. As a matter of disclosure, I have not personally been forced into this by a misguided editor or referee, but I have repeatedly been told by economists reluctant to use machine learning, or even classical nonparametric methods, that expected editors' demands for valid confidence intervals are holding them back from applying them. So devising Potemkin methods designed only to get an editor off your back without genuinely solving the problem, rather than forcefully explaining what cannot and can be done here, including valid inference as a separate task from point estimation, seems like a missed opportunity.

    Overall, this book is an incomparable resource bringing modern data science practices into reach for applied economists, who even at the forefront could learn a lot of immediate practical relevance from it.
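The review's point about intervals centered on a regularized estimate can be illustrated with a toy simulation. This is a minimal sketch, not taken from the book or the review, and it assumes the simplest possible setting: a single coefficient under an orthonormal design, where the lasso estimate is the soft-thresholded least-squares estimate. The function names `soft_threshold` and `naive_coverage` and all parameter values (true coefficient, standard error, penalty) are hypothetical choices made only for illustration.

```python
import random


def soft_threshold(b, lam):
    """Lasso estimate of one coefficient, assuming an orthonormal design."""
    if b > lam:
        return b - lam
    if b < -lam:
        return b + lam
    return 0.0


def naive_coverage(beta=1.0, se=0.25, lam=0.5, draws=20000, seed=0):
    """Share of naive 95% intervals (estimate +/- 1.96 * se) that contain beta.

    All parameter values are illustrative assumptions, not taken from any source.
    """
    rng = random.Random(seed)
    half_width = 1.96 * se
    hits_ols = 0
    hits_lasso = 0
    for _ in range(draws):
        ols = rng.gauss(beta, se)          # unbiased but noisy estimate
        lasso = soft_threshold(ols, lam)   # same estimate, shrunk toward zero
        hits_ols += abs(ols - beta) <= half_width
        hits_lasso += abs(lasso - beta) <= half_width
    return hits_ols / draws, hits_lasso / draws


cov_ols, cov_lasso = naive_coverage()
print(f"OLS-centered coverage:   {cov_ols:.3f}")
print(f"lasso-centered coverage: {cov_lasso:.3f}")
```

In this setup the interval centered on the unshrunk estimate covers the true coefficient at close to the nominal rate, while the interval of the same width centered on the shrunk estimate covers it far less often, because shrinkage moves the center away from the truth. Widening the interval enough to absorb that bias is the "massively inflated" case the review mentions.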


  2. says:

    The special feature of Business Data Science is that it combines a deep theoretical foundation, built upon the author’s academic research, with VERY practical tips on applying algorithms to business for real, drawn from the author’s industry experience. This is rare and invaluable. As a professional in the data field, when reading the book I consistently switched between “ok, now I understand this method’s theoretical foundation” and “wow, I should try this trick in my project.” I also like the writing style, with examples from various fields and the author’s clear explanations; sometimes it feels like an in-person chat with the author. It was indeed a pleasant reading experience, which is not common for a fairly technical book.

    I especially like the final chapter, Artificial Intelligence, where the author presents his thoughts on where the industry is heading and what talent will be key for this exciting future. I reread this chapter a couple of times.

    Overall, a great book for data scientists to enhance their theoretical foundation, expand their tool sets, and plan for career development.


  3. says:

    Covering a lot of statistics, with snippets of R code and a touch of Python, this book will give you a good overview of business-oriented data science. A caution: if you are not reasonably strong on your stats before you dive into this book, you will quickly become befuddled. I've got 12 credit hours of stats and found the book a challenge. Having said that, it covers the basics well and touches on pretty much everything from simple linear regression to machine learning and AI. If this is your discipline, it's a good intro.