10-day course on turning Big Data into effective business solutions
Data Scientist for Operational Excellence is a training course lasting 10 days, plus a Certification Day, that teaches participants to transform Big Data into Smart Data and to create concrete solutions that improve their decision-making, analytics, production, and business processes.
Prerequisites
- Entrance test to assess basic knowledge of statistics and of Python™, Pandas, Jupyter, and Power BI;
- Access to Windows and to Python™, Pandas, Jupyter, and Power BI software.
The pathway includes:
- In-person sessions at the Lean Factory School®;
- Tutoring with support for corporate project work;
- Certification based on the Final Test and Company Project Work.
Contents of the Data Scientist for Operational Excellence pathway:
- Big Data: the centrality of data as a strategic factor for business development
- Process data using industrial statistics tools
- Inferential analysis tools & model fitting
- Programming methods
- Data Collection
- Big Data
- Data Visualization
- Machine Learning
View the Detailed Program
Big Data: the centrality of data as a strategic factor for business development
- Data Driven Economy and Data Driven Decision: the role of data and information in business decisions
- Applications of Big Data Analysis
- Data Science and Data Analysis
- The role of the data scientist in the business organization
Process data using industrial statistics tools
- Introduction to statistics and elements of uncertainty management
- Models of statistical analysis
- Stratification and clustering of samples
- Numerical statistical summary: indicators of central tendency and dispersion
- Graphical statistical summary: the construction of effective reports (histograms, box-and-whisker plots, scatterplots)
- Analysis of sample behaviors: the main continuous and discrete probability distributions
- Process Capability Analysis
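As a small illustration of the last topic in this module, the sketch below computes the two usual process capability indices, Cp and Cpk, from a sample of measurements. The function name, the sample values, and the specification limits are hypothetical and are not taken from the course material.

```python
from statistics import mean, stdev


def process_capability(samples, lsl, usl):
    """Return (Cp, Cpk) for measurements against lower/upper spec limits."""
    mu = mean(samples)
    sigma = stdev(samples)  # sample standard deviation (n - 1 denominator)
    cp = (usl - lsl) / (6 * sigma)               # potential capability, ignores centering
    cpk = min(usl - mu, mu - lsl) / (3 * sigma)  # actual capability, penalizes off-center processes
    return cp, cpk


# Hypothetical shaft diameters in mm, with an assumed specification of 10.0 +/- 0.3
diameters = [9.9, 10.0, 10.1, 10.0, 9.9, 10.1, 10.0, 10.0]
cp, cpk = process_capability(diameters, lsl=9.7, usl=10.3)
print(f"Cp = {cp:.2f}, Cpk = {cpk:.2f}")
```

Cpk can never exceed Cp; the two coincide only when the process mean sits exactly midway between the specification limits.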
Inferential analysis tools & model fitting
- Verification of sample behaviors: analysis of outliers and anomalies
- Industrial use and applications of Hypothesis Testing
- Use and interpretation of ANOVA
- Use and interpretation of regression models
- Pattern analysis and summary of residual behavior (MAD, CV, ME, ...)
- Overview of control charts for behavior verification
Programming methods
- Introduction to programming languages: Python™ and the Jupyter Notebook environment
- Typical basic Python™ program constructs and syntax, from data types to functions
- Libraries: NumPy for operations on vectors and matrices; Matplotlib for visualization
- The Pandas Data Analysis library for data manipulation: importing, analyzing, extracting, sorting, grouping, and exporting data.
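The Pandas operations named in this module (importing, analyzing, extracting, sorting, grouping, and exporting) can be sketched in a few lines. The production log below is invented for illustration, and the column names are assumptions rather than course data.

```python
import io

import pandas as pd

# Hypothetical production log; in practice this would be read from a file path.
csv_text = """line,shift,units,defects
A,day,120,3
A,night,100,5
B,day,150,2
B,night,130,6
"""

df = pd.read_csv(io.StringIO(csv_text))                        # importing
df["defect_rate"] = df["defects"] / df["units"]                # analyzing: derive a new column
night = df[df["shift"] == "night"]                             # extracting: filter rows
worst_first = df.sort_values("defect_rate", ascending=False)   # sorting
per_line = df.groupby("line")[["units", "defects"]].sum()      # grouping: totals per line
exported = per_line.to_csv()                                   # exporting: returns CSV text when no path is given

print(per_line)
```

Passing a file name to `to_csv` would write the result to disk instead of returning it as a string.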
Data Collection
- Strategies for collecting, systematizing, and integrating heterogeneous data
- Main data models and formats: SQL vs NoSQL, CSV, JSON, images, etc.
- Scraping tools (SW libraries and interactive graphical tools) and REST APIs
Big Data
- Introduction and basic concepts
- Data preparation: data cleaning, normalization, missing data and anomaly management
- Challenges, evolution, tools and platforms for Big Data management
Data Visualization
- Introduction to information visualization (infovis): purpose, fundamentals, patterns and antipatterns
- Tools for Data Visualization: mockups, wireframes & UI Prototyping
- Implementation of interactive dashboards with Power BI
- Models and tools for evaluating interfaces and dashboards
Machine Learning
- Introduction to Machine Learning and evaluation metrics for discrete and continuous problems
- Advantages and disadvantages of supervised and unsupervised systems, applicability of different methodologies to different contexts
- Classification and regression
- Classifiers: k-Nearest Neighbor, Support Vector Machine, Random Forest
- Clustering (k-means) and Dimensionality Reduction (PCA)
- Introduction to Deep Learning, Convolutional Neural Networks, Recurrent Networks and Reinforcement Learning