Advanced data analytics using Python with architectural patterns, text and image classification, and optimization techniques

Understand advanced data analytics concepts such as time series and principal component analysis with ETL, supervised learning, and PySpark using Python. This book covers architectural patterns in data analytics, text and image classification, optimization techniques, natural language processing, an...

Main Author: Mukhopadhyay, Sayan
Other Authors: Samanta, Pratip; SpringerLink (Online service)
Format: eBook
Language: English
Published: New York, NY : Apress, [2023]
Physical Description: 1 online resource (259 pages) : illustrations.
Edition: Second edition.
Table of Contents:
  • Intro
  • Table of Contents
  • About the Authors
  • About the Technical Reviewer
  • Acknowledgments
  • Introduction
  • Chapter 1: A Birds Eye View to AI System
  • OOP in Python
  • Calling Other Languages in Python
  • Exposing the Python Model as a Microservice
  • High-Performance API and Concurrent Programming
  • Choosing the Right Database
  • Summary
  • Chapter 2: ETL with Python
  • MySQL
  • How to Install MySQLdb?
  • Database Connection
  • INSERT Operation
  • READ Operation
  • DELETE Operation
  • UPDATE Operation
  • COMMIT Operation
  • ROLL-BACK Operation
  • Normal Forms
  • First Normal Form
  • Second Normal Form
  • Third Normal Form
  • Elasticsearch
  • Connection Layer API
  • Neo4j Python Driver
  • neo4j-rest-client
  • In-Memory Database
  • MongoDB (Python Edition)
  • Import Data into the Collection
  • Create a Connection Using pymongo
  • Access Database Objects
  • Insert Data
  • Update Data
  • Remove Data
  • Cloud Databases
  • Pandas
  • ETL with Python (Unstructured Data)
  • Email Parsing
  • Topical Crawling
  • Crawling Algorithms
  • Summary
  • Chapter 3: Feature Engineering and Supervised Learning
  • Dimensionality Reduction with Python
  • Correlation Analysis
  • Principal Component Analysis
  • Mutual Information
  • Classifications with Python
  • Semi-Supervised Learning
  • Decision Tree
  • Which Attribute Comes First?
  • Random Forest Classifier
  • Naïve Bayes Classifier
  • Support Vector Machine
  • Nearest Neighbor Classifier
  • Sentiment Analysis
  • Image Recognition
  • Regression with Python
  • Least Square Estimation
  • Logistic Regression
  • Classification and Regression
  • Intentionally Bias the Model to Over-Fit or Under-Fit
  • Dealing with Categorical Data
  • Summary
  • Chapter 4: Unsupervised Learning: Clustering
  • K-Means Clustering
  • Choosing K: The Elbow Method
  • Silhouette Analysis
  • Distance or Similarity Measure
  • Properties
  • General and Euclidean Distance
  • Squared Euclidean Distance
  • Distance Between String-Edit Distance
  • Levenshtein Distance
  • Needleman-Wunsch Algorithm
  • Similarity in the Context of a Document
  • Types of Similarity
  • Example of K-Means in Images
  • Preparing the Cluster
  • Thresholding
  • Time to Cluster
  • Revealing the Current Cluster
  • Hierarchical Clustering
  • Bottom-Up Approach
  • Distance Between Clusters
  • Single Linkage Method
  • Complete Linkage Method