E-book
Author Cuadrado-Gallego, Juan J., author.

Title Data analytics : a theoretical and practical view from the EDISON Project / Juan J. Cuadrado-Gallego, Yuri Demchenko ; with contributions by Josefa Gómez Pérez and Abdelhamid Tayebi Tayebi
Published Cham : Springer, [2023]
©2023

Description 1 online resource (xiii, 477 pages) : illustrations (some color)
Contents Chapter 1. Introduction to data science and data analytics -- 1.1 About Data Science -- 1.2 About the EDISON Project and Data Science Framework -- 1.2.1 The EDISON project -- 1.2.2 The EDISON Data Science Framework -- 1.3 About Data Analytics -- 1.3.1 Data Analytics Competences -- 1.3.2 Data Analytics Body of Knowledge -- 1.3.3 Data Analytics Model Curriculum Approach -- 1.3.4 Data Analytics Professional Profiles -- 1.4 About this Book -- Chapter 2. Data -- A. Theory -- 2.1 Introduction -- 2.2 Characteristic -- 2.2.1 Definition of characteristic -- 2.2.2 Types of characteristics -- 2.3 Data -- 2.3.1 Definition of Data -- 2.3.2 Types of data from their nature -- 2.3.3 Types of data from their storage -- 2.4 Available Data -- 2.4.1 Experiment -- 2.4.2 Data population -- 2.4.3 Data Sample -- 2.4.4 Data Quality -- 2.5 Frequency -- 2.5.1 Definition of frequency -- 2.5.2 Types of frequency -- 2.5.3 Frequency of grouped Data -- 2.5.4 Mode -- 2.6 Mean -- 2.6.1 Definition of Mean -- 2.6.2 Arithmetic Mean -- 2.6.3 Variance and Standard Deviation -- 2.7 Median -- 2.7.1 Range -- 2.7.2 Median -- 2.7.3 Quantiles -- 2.7.4 Quantiles range -- B. Computer Based Solving -- 2.8 R project -- 2.9 R graphical user interface -- 2.10 Data exercises solved with R -- C. Data exercises solved -- 2.11 Handmade exercises -- 2.12 Exercises solved in R -- Annex. Data Extended Concepts -- 2.A.1 Frequency -- 2.A.2 Mean -- Chapter 3. Probability -- A. Theory -- 3.1 Introduction -- 3.2 Event -- 3.3 Set theory actions and operations -- 3.4 Laplace or classic probability -- 3.5 Bayesian Probability -- 3.6 Probability distribution of random variables -- 3.6.1 Random Variable -- 3.6.2 Probability distribution -- 3.6.3 Discrete probability distributions -- 3.6.3.1 Bernoulli Probability distribution -- 3.6.3.2 Binomial Probability distribution -- 3.6.3.3 Geometric Probability distribution -- 3.6.3.4 Poisson Probability distribution -- 3.6.4 Continuous probability distribution -- 3.6.4.1 Normal Distribution -- 3.6.4.2 Pearson chi square distribution -- 3.6.4.3 Student's t distribution -- 3.6.4.4 Fisher's F distribution -- B. Computer Based Solving -- C. Probability exercises solved -- 3.7 Handmade exercises -- 3.8 Exercises solved in R -- Annex. Probability extended concepts -- Chapter 4. Anomaly Detection -- Juan J. Cuadrado-Gallego, Yuri Demchenko, Josefa Gómez, Abdelhamid Tayebi -- A. Theory -- 4.1 Introduction -- 4.2 Anomaly detection based on statistics -- 4.2.1 Anomaly detection based on the mean and the standard deviation -- 4.2.2 Anomaly detection based on the quartiles -- 4.2.3 Anomaly detection based on errors of the residuals -- 4.3 Anomaly detection based on proximity. K nearest neighbor algorithm -- 4.4 Anomaly detection based on density. Simplified local outlier factor algorithm -- B. Computer based solving -- 4.5 R packages -- 4.6 Anomaly detection exercises solved with R -- C. Anomaly detection exercises solved -- 4.7 Handmade exercises -- 4.8 Exercises solved in R -- Chapter 5. Unsupervised Classification -- Juan J. Cuadrado-Gallego, Yuri Demchenko, Abdelhamid Tayebi -- A. Theory -- 5.1 Introduction -- 5.2 Unsupervised classification based on distances. K-means algorithm -- 5.3 Agglomerative hierarchical clustering -- B. Computer based solving -- 5.4 RStudio -- 5.5 Unsupervised classification exercises solved with R -- C. Unsupervised classification exercises solved -- 5.6 Handmade exercises -- 5.7 Exercises solved in R -- Chapter 6. Supervised Classification -- Juan J. Cuadrado-Gallego, Yuri Demchenko, Josefa Gómez -- A. Theory -- 6.1 Introduction -- 6.2 Decision tree -- 6.2.1 Optimizing the construction of a decision tree: ID3 Algorithm -- 6.2.2 Optimizing the construction of a decision tree: CART Algorithm -- 6.2.3 Optimizing the construction of a decision tree: Error Algorithm -- 6.3 Neural Network -- 6.4 Naïve Bayes -- 6.5 Regression functions -- 6.5.1 Linear regression of polynomial events -- 6.5.2 Linear regression of polynomial for three events -- 6.5.3 Linear regression of polynomial for K events -- 6.5.4 Nonlinear regression of polynomial for two events -- 6.5.5 Nonlinear regression of not polynomial for two events -- 6.5.6 Linear regression validity analysis -- B. Computer based solving -- C. Supervised classification analysis exercises solved -- 6.6 Handmade exercises -- 6.7 Exercises solved in R -- Chapter 7. Association -- A. Theory -- 7.1 Introduction -- 7.2 Analysis of association of events composed by a single elementary event -- 7.2.1 Support -- 7.2.2 Confidence -- 7.2.3 Contingency -- 7.2.4 Correlation -- 7.3 Analysis of association of events composed by more than one elementary event. Apriori algorithm -- B. Computer based solving -- C. Association analysis exercises solved -- 7.4 Handmade exercises -- 7.5 Exercises solved in R
Summary Building upon the knowledge introduced in The Data Science Framework, this book provides a comprehensive and detailed examination of each aspect of Data Analytics, from both a theoretical and a practical standpoint. The book explains representative algorithms associated with different techniques, from their theoretical foundations to their implementation and use with software tools. Designed as a textbook for a Data Analytics Fundamentals course, it is divided into seven chapters to correspond with 16 weeks of lessons, including both theoretical and practical exercises. Each chapter is dedicated to a lesson, allowing readers to dive deep into each topic with detailed explanations and examples. Readers will learn the theoretical concepts and then immediately apply them to practical exercises to reinforce their knowledge. In the lab sessions, readers will learn the ins and outs of the R environment and data science methodology to solve exercises with the R language. With detailed solutions provided for all examples and exercises, readers can use this book to study and master data analytics on their own. Whether you're a student, professional, or simply curious about data analytics, this book is a must-have for anyone looking to expand their knowledge in this exciting field.
Bibliography Includes bibliographical references
Notes Print version record
Subject Quantitative research.
Big data.
Form Electronic book
Author Demchenko, Yuri, author
ISBN 9783031391293
3031391292