39 Cloud Data: Architecture Gallery
39.1 Learning Objectives
By the end of this chapter, you will be able to:
- Distinguish Architecture Patterns: Compare data lake, data warehouse, and lakehouse patterns from visual diagrams
- Trace Data Flow: Map data from IoT devices through processing pipelines to analytics endpoints
- Apply Sensor Fusion Concepts: Explain Kalman filters, particle filters, and probabilistic approaches for multi-sensor integration
- Design ML Pipelines: Plan predictive model deployment pipelines for IoT applications
Key Concepts
- Reference architecture: A standardised, reusable architectural blueprint for a class of systems, providing a proven starting point that avoids reinventing common design decisions.
- IoT data tier: A logical layer in a cloud architecture responsible for ingesting, storing, and serving sensor data, typically composed of message brokers, time-series databases, and object storage.
- Data lakehouse: A modern architecture combining the low-cost storage of a data lake with the structured query performance of a data warehouse, suitable for mixed IoT analytics workloads.
- Event-driven architecture: A design pattern where components communicate by publishing and subscribing to events (sensor readings, alarms) rather than through direct API calls, enabling loose coupling and scalability.
- Zonal redundancy: Deploying cloud infrastructure across multiple availability zones within a region to survive the failure of any single physical facility without service interruption.
- SLA (Service Level Agreement): A contractual commitment specifying uptime, latency, and data retention guarantees for a cloud IoT platform — a key factor when selecting managed services.
For Beginners: Cloud Data Architecture
Cloud data architecture is the blueprint for organizing IoT data in cloud services. Think of designing a warehouse layout – you need receiving docks for incoming data, organized storage areas, processing stations, and shipping areas where insights are delivered. A good architecture keeps data flowing efficiently from sensors to dashboards.
39.2 Prerequisites
Before diving into this chapter, you should be familiar with:
- Cloud Data: IoT Reference Model Levels 5-7: Understanding the upper levels of the IoT reference model
- Cloud Data: Platforms and Services: Cloud service models and platform options
- Cloud Data: Quality and Security: Data cleaning and security considerations
39.3 Cloud Data Architecture Visuals
39.3.1 Storage and Processing Patterns
Batch Processing
Batch processing handles large volumes of historical IoT data through scheduled jobs, enabling complex analytics that don’t require real-time results.
Data Fusion Architecture
Data fusion combines information from multiple IoT sensors to produce more accurate and reliable insights than any single sensor could provide.
Data Lake Pipeline
Data lakes store raw IoT data in native format, enabling schema-on-read flexibility for diverse analytics and machine learning workloads.
Data Lake
Data lakes provide cost-effective storage for massive IoT data volumes, supporting both structured and unstructured data types.
Data Lakehouse
Data lakehouses combine the flexibility of data lakes with the query performance of data warehouses, ideal for IoT analytics at scale.
Data Warehouse
Data warehouses provide structured, optimized storage for IoT analytics with pre-defined schemas enabling fast business intelligence queries.
Data Mesh
Data mesh applies domain-driven design to IoT data, with decentralized ownership and federated governance enabling scalable data management.
Data Replication
Data replication ensures IoT data availability and durability through redundant copies across storage systems and geographic regions.
Data Lineage
Data lineage tracks IoT data from sensor origin through all transformations to final consumption, enabling debugging and compliance.
Data Pipeline Orchestration
Pipeline orchestration manages complex IoT data workflows with scheduling, dependency tracking, monitoring, and automated recovery.
Stream Processor
Stream processors enable real-time IoT analytics through continuous processing of sensor data as it arrives.
Stream Processing Flow
End-to-end stream processing pipelines provide reliable real-time IoT analytics with guaranteed delivery semantics.
Time Series
Time series data is the fundamental data type in IoT, requiring specialized storage and analytics techniques.
Time Synchronization
Accurate time synchronization is essential for correlating IoT events across distributed sensor networks.
Converged Data Networks
Converged networks efficiently carry IoT data streams alongside other enterprise traffic through unified infrastructure.
Edge-Cloud Placement
Optimal placement of IoT processing between edge and cloud depends on latency, bandwidth, privacy, and computational requirements.
39.3.2 Machine Learning and Prediction
Predictive Model
Predictive models transform IoT sensor patterns into actionable forecasts for maintenance, demand, and anomaly detection.
Model Registry
Model registries manage the lifecycle of IoT machine learning models from training through deployment and monitoring.
39.3.3 Sensor Fusion and Filtering
Kalman Filter
The Kalman filter provides optimal sensor fusion for noisy IoT measurements with known dynamics and noise characteristics.
Kalman Filter Limitations
Understanding Kalman filter limitations guides selection of appropriate filtering techniques for different IoT scenarios.
Three Equations
The three core Kalman filter equations enable recursive optimal estimation from noisy IoT sensor measurements.
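Although the diagram itself is not reproduced here, the three steps it refers to are conventionally written as follows. This is a sketch of the standard scalar form (state estimate \(\hat{x}\), error variance \(P\), process noise variance \(Q\), measurement \(z_k\), measurement noise variance \(R\)), assuming an identity state transition for brevity; the full matrix form replaces variances with covariance matrices:
\[\text{Predict:}\quad \hat{x}_k^- = \hat{x}_{k-1}, \qquad P_k^- = P_{k-1} + Q\]
\[\text{Gain:}\quad K_k = \frac{P_k^-}{P_k^- + R}\]
\[\text{Update:}\quad \hat{x}_k = \hat{x}_k^- + K_k\,(z_k - \hat{x}_k^-), \qquad P_k = (1 - K_k)\,P_k^-\]
This is the same gain expression applied numerically in the "Putting Numbers to It" example later in this chapter, with \(P_k^- = \sigma_{\text{predict}}^2\) and \(R = \sigma_{\text{GPS}}^2\).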
Particle Filter Location
Particle filters enable robust localization for IoT devices when sensor noise is non-Gaussian or system dynamics are nonlinear.
Particle Filter Correction
Particle filter correction updates position estimates by weighting particles according to sensor measurement likelihood.
Particle Filter Correction 2
Advanced particle filtering combines multiple IoT sensor modalities for robust state estimation in challenging environments.
Particle Filter Resampling
Resampling maintains particle diversity in IoT tracking applications, preventing degeneracy as estimates converge.
Particle Filter Resampling 2
Adaptive resampling optimizes computational efficiency while maintaining tracking accuracy in resource-constrained IoT devices.
Particle Filter Resampling 3
Different resampling algorithms offer trade-offs between variance reduction and computational complexity for IoT applications.
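To make the resampling mechanics concrete, here is a minimal systematic-resampling sketch in Python. The function name and structure are illustrative only; production filters typically use a vectorised library implementation, and systematic resampling is just one of the algorithms the trade-off entry above alludes to.

```python
import random

def systematic_resample(particles, weights):
    """Systematic resampling: draw one random offset, then step through
    the cumulative weight distribution at evenly spaced points. Runs in
    O(n) with low variance, which suits resource-constrained devices."""
    n = len(particles)
    total = sum(weights)
    # Build the cumulative distribution of normalised weights.
    cumulative, acc = [], 0.0
    for w in weights:
        acc += w / total
        cumulative.append(acc)
    cumulative[-1] = 1.0  # guard against floating-point round-off
    # One random start in [0, 1/n), then n evenly spaced pointers.
    start = random.random() / n
    resampled, i = [], 0
    for j in range(n):
        point = start + j / n
        while cumulative[i] < point:
            i += 1
        resampled.append(particles[i])
    return resampled

# High-weight particles get duplicated; low-weight ones tend to vanish.
states = [(0.0, 0.0), (1.0, 0.5), (2.0, 1.0), (3.0, 1.5)]
survivors = systematic_resample(states, [0.05, 0.70, 0.20, 0.05])
```

Resampling only when the effective sample size drops below a threshold (the adaptive variant described above) is a common refinement that saves computation on constrained devices.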
Filters and Smoothers
Smoothers provide superior estimates when post-processing IoT data offline, while filters work in real-time with causal data.
Probabilistic Approach
Probabilistic approaches maintain uncertainty estimates alongside point values, enabling informed decision-making in IoT systems.
Recursive Bayesian Filters
Recursive Bayesian filters provide a principled framework for sequential sensor fusion in IoT applications.
39.3.5 Data Quality and Calibration
Calibrated Data
Sensor calibration corrects systematic errors in IoT measurements, improving accuracy across operating conditions.
Noisy Measurements
Understanding noise sources in IoT sensors guides selection of appropriate filtering and fusion techniques.
Symmetry Problem
Sensor symmetry creates ambiguity in orientation estimation that requires additional measurements or constraints to resolve.
39.3.6 State Estimation and Tracking
State Vector
State vectors capture all relevant information about IoT device state for estimation and prediction algorithms.
Simple Tracking Example
Simple tracking demonstrates fundamental concepts of state estimation from noisy IoT sensor measurements.
Simple Tracking Example 2
Handling measurement gaps is essential for robust IoT tracking when sensors intermittently lose signal.
Complex Tracking Example
Complex tracking scenarios require sophisticated data association and track management for multiple IoT targets.
Works Well Scenario
Understanding favorable conditions helps design IoT deployments that maximize tracking performance.
39.3.7 Cloud and Infrastructure
Service Models
Cloud service models define responsibility boundaries for IoT deployments, from infrastructure to complete applications.
SKA Data Challenge
The SKA represents extreme IoT data challenges, generating exabytes of sensor data requiring innovative processing solutions.
(SKA here is the Square Kilometre Array radio telescope, an extreme example of sensor-generated data at scale.)
Cloud Usage Patterns
Cloud-based IoT usage involves coordinated patterns of device registration, data ingestion, rules-based processing, analytics, and application integration (see Figure 39.3 above for the end-to-end workflow).
Networking
Cloud networking for IoT requires careful security configuration while enabling reliable device connectivity.
Sensing Resource Intensive
Resource-intensive sensing applications require careful architecture balancing edge processing with cloud capabilities.
39.3.8 Data Processing Details
Information Flow Types
Different information flow patterns suit different IoT application requirements for latency, throughput, and reliability.
Data Generation Table
Understanding data generation rates by IoT device type enables appropriate infrastructure sizing and cost estimation.
Data Generation Table 2
Projected IoT data growth drives architecture decisions for scalable data management infrastructure.
Implementation
Practical IoT data pipeline implementation requires careful technology selection and integration planning.
The Nitty Gritty
Understanding low-level data handling details is essential for efficient IoT system implementation.
Key Takeaway
Cloud data architectures combine multiple patterns to serve different IoT needs: data lakes for flexible raw storage, data warehouses for fast structured queries, stream processors for real-time analytics, and sensor fusion algorithms like Kalman and particle filters for extracting accurate state estimates from noisy measurements. Selecting the right combination of patterns depends on your latency, accuracy, and cost requirements.
Putting Numbers to It
Kalman Filter State Estimation: A GPS tracker reports position with \(\pm 10\text{ m}\) error (\(\sigma_{\text{GPS}} = 10\)). We predict movement based on last velocity: predicted position has \(\pm 5\text{ m}\) uncertainty (\(\sigma_{\text{predict}} = 5\)).
Kalman Gain determines optimal weight between prediction and measurement: \[K = \frac{\sigma_{\text{predict}}^2}{\sigma_{\text{predict}}^2 + \sigma_{\text{GPS}}^2} = \frac{25}{25 + 100} = \frac{25}{125} = 0.2\]
Predicted position: \((100, 200)\) meters. GPS measurement: \((110, 205)\) meters. Kalman filter fuses both:
\[\text{Best estimate} = \text{prediction} + K \times (\text{measurement} - \text{prediction})\]
\[x = 100 + 0.2 \times (110 - 100) = 100 + 2 = 102 \text{ m}\] \[y = 200 + 0.2 \times (205 - 200) = 200 + 1 = 201 \text{ m}\]
Final position: \((102, 201)\) with reduced uncertainty \(\sigma_{\text{fused}} = \sqrt{(1-K) \times \sigma_{\text{predict}}^2} = \sqrt{0.8 \times 25} = 4.47\text{ m}\). Fusion reduces error from 10m (GPS alone) to 4.47m—55% improvement.
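The arithmetic above fits in a few lines of Python. The `fuse` helper name is our own, and this is the scalar (per-axis) update only, applied independently to x and y:

```python
def fuse(prediction, measurement, sigma_predict, sigma_meas):
    """One scalar Kalman update: blend a model prediction with a noisy
    measurement, weighting each by its variance."""
    k = sigma_predict ** 2 / (sigma_predict ** 2 + sigma_meas ** 2)
    estimate = prediction + k * (measurement - prediction)
    fused_sigma = ((1 - k) * sigma_predict ** 2) ** 0.5
    return estimate, fused_sigma, k

# GPS tracker example: prediction (100, 200), measurement (110, 205),
# sigma_predict = 5 m, sigma_GPS = 10 m.
x, sigma_x, k = fuse(100, 110, 5, 10)   # k = 0.2, x = 102.0
y, sigma_y, _ = fuse(200, 205, 5, 10)   # y = 201.0
print(round(x, 1), round(y, 1), round(sigma_x, 2))  # 102.0 201.0 4.47
```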
39.3.9 Interactive: Kalman Filter Gain Explorer
Experiment with different sensor uncertainties to see how the Kalman gain changes. A higher gain means the filter trusts the measurement more; a lower gain means it trusts the prediction more.
Worked Example: Designing a Data Lake vs Data Warehouse for Smart City IoT
A smart city project collects data from 100,000 sensors (traffic cameras, air quality, parking, weather) generating 500 GB/day. The city needs both real-time dashboards (traffic flow) and historical analytics (urban planning). Should they use a data lake, data warehouse, or both?
Requirement Analysis:
| Use Case | Data Type | Query Pattern | Latency | Users |
|---|---|---|---|---|
| Traffic dashboards | Live camera object counts | Pre-aggregated metrics, time-series | <5 seconds | 50 traffic operators |
| Air quality alerts | Sensor readings (PM2.5, NO2) | Threshold checks, spatial queries | <1 minute | 200 citizens via app |
| Urban planning | 5 years historical data (500 GB/day x 365 x 5 = 912 TB) | Complex JOINs, correlations across datasets | Minutes to hours | 20 city planners |
| ML model training | Raw sensor data + weather + events | Full dataset scans, feature engineering | Hours | 5 data scientists |
Architecture Decision: Data Lakehouse (Hybrid Approach)
Layer 1 - Data Lake (Bronze):
- Store all raw sensor data in AWS S3 (or equivalent)
- Format: Parquet files partitioned by sensor_type/date
- Purpose: Long-term retention, ML training, exploratory analysis
- Cost: 912 TB x $0.023/GB/month = $20,976/month
Layer 2 - Data Warehouse (Silver):
- AWS Redshift cluster for structured, aggregated data
- ETL pipeline: Hourly jobs aggregate raw data to 1-minute summaries
- Reduces data from 500 GB/day raw to 5 GB/day aggregates (100x compression)
- Purpose: Fast BI queries, operational dashboards
- Cost: dc2.large 4-node cluster = $4,380/month
Layer 3 - Real-Time Stream (Gold):
- Apache Kafka + Flink for live sensor ingestion and aggregation
- Materialize traffic counts, air quality averages to Redis for <1s dashboard queries
- Purpose: Real-time monitoring and alerts
- Cost: 3 Kafka brokers + 2 Flink workers = $1,800/month
Total Architecture Cost: $27,156/month
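The layer costs above can be sanity-checked in a few lines. The $0.023/GB/month S3 rate and the flat cluster prices come from this worked example and are assumptions, not current vendor pricing:

```python
def s3_monthly_usd(terabytes, usd_per_gb_month=0.023):
    """Object-storage cost for the bronze layer (1 TB = 1,000 GB here)."""
    return terabytes * 1_000 * usd_per_gb_month

bronze = s3_monthly_usd(912)   # raw data lake: 912 TB in S3
silver = 4_380                 # quoted 4-node Redshift cluster price
gold = 1_800                   # quoted Kafka + Flink stream-layer price
total = bronze + silver + gold
print(f"${bronze:,.0f} + ${silver:,} + ${gold:,} = ${total:,.0f}/month")
```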
Query Performance Comparison:
| Query Type | Data Lake Only | Data Warehouse Only | Lakehouse (Hybrid) |
|---|---|---|---|
| “Show traffic counts for last hour” | 30 seconds (scan 20 GB Parquet) | 2 seconds (indexed aggregates) | 0.5 seconds (Redis cache) |
| “Find correlation between air quality and traffic over 3 years” | 10 minutes (full scan) | Not possible (raw data deleted) | 12 minutes (Spark on data lake) |
| “Train ML model on 5 years of sensor data” | 2 hours (Spark on S3) | Not possible (aggregates lose detail) | 2 hours (Spark on S3) |
| “Alert if PM2.5 >100 in any district” | 45 seconds (too slow) | 5 seconds (query warehouse) | <1 second (stream processing) |
Key Insight: The data lakehouse combines the benefits of both architectures:
- Data lake for raw storage (cheapest: $0.023/GB/month) and ML training
- Data warehouse for fast BI queries on aggregates (100x smaller, 10x faster)
- Stream layer for real-time monitoring (<1s latency)
Alternative Approaches (Not Chosen):
Data Warehouse Only: Cannot support ML training (aggregates lose raw detail), and far more expensive for 912 TB of storage (roughly $264K/month at a $0.29/GB/month Redshift rate vs $21K in S3)
Data Lake Only: Query latency too high for operational dashboards (30s vs <1s with warehouse caching), poor support for concurrent BI users
Separate Data Lake + Warehouse with ETL: Duplicate storage costs (raw + aggregates), ETL complexity, data staleness issues
Decision Framework: Selecting Cloud Data Architecture Pattern
| Pattern | Best For | Cost ($/GB/month) | Query Speed | Schema Flexibility | When NOT to Use |
|---|---|---|---|---|---|
| Data Lake | Raw data archive, ML training, exploratory analytics | $0.02-0.05 | Slow (minutes) | High (schema-on-read) | Real-time dashboards, high-concurrency BI |
| Data Warehouse | Business intelligence, structured reports, OLAP | $0.25-1.50 | Fast (seconds) | Low (schema-on-write) | Unstructured data, rapid schema changes, petabyte scale |
| Data Lakehouse | Combined analytics + ML, cost-effective at scale | $0.03-0.10 | Medium (10s-min) | Medium | Simple use cases (pick lake OR warehouse) |
| Stream Processor | Real-time analytics, live dashboards, event-driven | $0.10-0.30 | Very fast (<1s) | High | Historical queries, batch analytics |
| Time-Series DB | Sensor data, metrics, monitoring | $0.30-1.00 | Fast (sub-second) | Low (fixed schema) | Non-time-series data, ad-hoc queries |
Decision Tree:
- Do you need real-time results (<5 seconds)?
- Yes → Stream Processor (Kafka + Flink) or Time-Series DB (InfluxDB)
- No → Continue
- Is your data mostly time-series sensor readings?
- Yes → Time-Series DB (InfluxDB, TimescaleDB, Amazon Timestream)
- No → Continue
- Do you need to run ML models on raw data?
- Yes + Need fast BI → Data Lakehouse (Databricks, Snowflake)
- Yes + No BI → Data Lake (S3 + Athena/Spark)
- No → Continue
- Is your schema stable and queries well-defined?
- Yes + <10 TB → Data Warehouse (Redshift, BigQuery)
- Yes + >10 TB → Data Lakehouse (more cost-effective)
- No → Data Lake (schema-on-read flexibility)
- What’s your query concurrency?
- >50 concurrent users → Data Warehouse (optimized for OLAP)
- <10 users → Data Lake (batch processing)
- Mixed → Data Lakehouse
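The decision tree above can be sketched as a function. The argument names and thresholds simply transcribe the steps above and are illustrative, not vendor guidance; the concurrency question (step 5) is left as a tie-breaker outside this sketch:

```python
def pick_architecture(latency_s, time_series_only, ml_on_raw,
                      needs_bi, schema_stable, data_tb):
    """Hypothetical encoding of the chapter's decision tree."""
    # Step 1: a hard real-time requirement dominates everything else.
    if latency_s < 5:
        return "stream processor / time-series DB"
    # Step 2: purely time-series workloads get a specialised store.
    if time_series_only:
        return "time-series DB"
    # Step 3: ML on raw data forces lake-style storage.
    if ml_on_raw:
        return "data lakehouse" if needs_bi else "data lake"
    # Step 4: stable schema and well-defined queries.
    if schema_stable:
        return "data warehouse" if data_tb < 10 else "data lakehouse"
    # Otherwise: keep schema-on-read flexibility.
    return "data lake"

# Smart-city example: batch latency is fine, ML on raw data plus BI needed.
print(pick_architecture(60, False, True, True, True, 912))  # data lakehouse
```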
Common Mistake: Using Data Warehouse for IoT Raw Sensor Storage
The Error: An industrial IoT company stores raw vibration sensor data (10 kHz sampling) from 500 machines in AWS Redshift data warehouse, paying $45,000/month for storage alone.
The Math:
- Sensors: 500 machines x 10,000 samples/sec x 4 bytes = 20 MB/s raw data
- Daily: 20 MB/s x 86,400 s = 1.73 TB/day
- 90-day retention: 1.73 TB x 90 = 155 TB
- Redshift cost: 155 TB x $0.29/GB/month = $44,950/month storage
Why This Is Wrong: Data warehouses are optimized for:
- Structured, aggregated data (summaries, not raw samples)
- Complex SQL queries with JOINs across dimensions
- Low-latency BI dashboards (<5 seconds)
IoT raw sensor data is:
- High-volume, low-value until aggregated
- Rarely queried directly (only during debugging)
- Better suited for batch processing (Spark jobs)
Correct Approach - Tiered Storage:
Tier 1 - S3 Data Lake (Raw):
- Store 155 TB raw data in S3 Standard-IA
- Cost: 155 TB x $0.0125/GB/month = $1,938/month (23x cheaper)
- Query via Athena or Spark when needed (rare)
Tier 2 - Redshift (Aggregates):
- ETL pipeline: Compress 1.73 TB/day raw to 1.7 GB/day aggregates (1000x reduction)
- Extract FFT features (20 frequency bins per machine per second)
- Store: machine_id, timestamp, freq_bin_1..20, anomaly_score
- 90-day aggregates: 1.7 GB x 90 = 153 GB
- Redshift cost: 153 GB x $0.29/GB/month = $44/month
Tier 3 - Redis (Real-Time Cache):
- Cache last 1 hour of aggregates for live dashboards
- Cost: ElastiCache r5.large = $150/month
Total New Cost: $1,938 + $44 + $150 = $2,132/month (vs $45,000)
Savings: $42,868/month = $514,416/year
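The two designs can be compared directly in code. The $/GB rates are this example's assumed figures, not live vendor pricing, and note that the $514,416/year figure above rounds the warehouse cost up to $45,000/month before subtracting:

```python
# Monthly storage costs for the vibration-sensor example.
RAW_TB = 155    # 90 days of raw 10 kHz samples
AGG_GB = 153    # 90 days of FFT-feature aggregates

warehouse_only = RAW_TB * 1_000 * 0.29            # all raw data in Redshift
tiered = (RAW_TB * 1_000 * 0.0125                 # raw data in S3 Standard-IA
          + AGG_GB * 0.29                         # aggregates in Redshift
          + 150)                                  # Redis cache (flat rate)

print(f"warehouse only: ${warehouse_only:,.0f}/month")   # $44,950/month
print(f"tiered:         ${tiered:,.0f}/month")           # $2,132/month
print(f"yearly savings: ${(warehouse_only - tiered) * 12:,.0f}")
```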
Query Performance:
- Live dashboard (last hour): 0.5s (Redis cache) - faster than before
- Historical analysis (last 90 days): 3s (Redshift aggregates) - same as before
- Raw data investigation (rare): 5 minutes (Athena scan of S3) - acceptable for debugging
Key Lesson: Data warehouses charge premium prices for premium performance on structured data. IoT raw sensor streams are unstructured, high-volume, and rarely queried. Store raw data in cheap object storage (S3) and load into the warehouse only the aggregates you query frequently. This pattern saves 95%+ on storage costs while maintaining (or improving) query performance for the actual use cases.
39.3.10 Interactive: IoT Storage Cost Estimator
Estimate the monthly cost difference between storing IoT sensor data in a data warehouse versus a tiered architecture (data lake + warehouse for aggregates).
For Kids: Meet the Sensor Squad!
Imagine a giant picture book that shows how data travels from tiny sensors all the way to big computers in the sky!
39.3.11 The Sensor Squad Adventure: The Architecture Picture Book
One rainy afternoon, Sammy the Sensor found a huge picture book in the library called “The Amazing Journey of Data.” She called her friends over to look.
The first page showed a Data Lake – a giant pool where ALL kinds of data splash in together. “It’s like a swimming pool where numbers, pictures, and words all swim around!” said Sammy. “You can dive in and find whatever you need.”
The next page showed a Data Warehouse – neat shelves with perfectly organized boxes. “This is more like a tidy cupboard,” explained Max the Microcontroller. “Everything is sorted and labeled so you can grab what you need super fast.”
Lila the LED pointed to a page showing Stream Processing – data flowing like a river with little workers checking each drop as it passes. “These workers read each piece of data as it flows by, looking for anything unusual. If they spot something hot or cold, they shout an alert!”
Bella the Battery’s favorite page showed the Kalman Filter – a clever helper that combines guesses with measurements. “Imagine you’re trying to guess where a ball is going. The Kalman Filter takes your best guess AND what your eyes see, then combines them for a SUPER accurate answer!”
“This picture book shows all the different ways we can organize and use data,” said Max. “Architects use these patterns to build amazing IoT systems!”
39.3.12 Key Words for Kids
| Word | What It Means |
|---|---|
| Data Lake | A big storage pool where all kinds of raw data are kept together |
| Data Warehouse | An organized storage where data is neatly sorted for fast searching |
| Architecture | The plan for how all the parts of a system fit together |
| Kalman Filter | A smart math trick that combines guesses and measurements for better accuracy |
39.4 Quiz: Cloud Data Architecture
- Which IoT Reference Model level handles data reconciliation and normalization?
- Level 3: Edge/Fog Computing
- Level 4: Data Accumulation
- Level 5: Data Abstraction
- Level 6: Application
- What does IaaS stand for in cloud computing?
- Internet as a Service
- Infrastructure as a Service
- Integration as a Service
- Information as a Service
- Which is NOT one of the top cloud security threats identified by Cloud Security Alliance?
- Data breaches
- Weak identity management
- High bandwidth costs
- Insecure APIs
- In data cleaning, what should happen to a temperature reading of 150 degrees C when the maximum allowed is 60 degrees C?
- Delete the entire record
- Mark as suspicious and clamp to maximum (60 degrees C)
- Accept as valid
- Ignore and continue
- What is data provenance?
- Where data is physically stored
- How much data is generated
- Recording sources and transformations of data
- The speed of data transmission
- Which cloud service model provides complete applications to users?
- IaaS
- PaaS
- SaaS
- FaaS
- In a 3-year TCO comparison, if cloud costs $100k and on-premises costs $180k, what is the savings percentage with cloud?
- 25%
- 44%
- 56%
- 80%
- Which of the 4 Vs benefits most directly from cloud parallel processing?
- Volume
- Velocity
- Variety
- Veracity
- What is the main advantage of tracking data freshness?
- Reduce storage costs
- Determine reliability for time-sensitive decisions
- Improve network speed
- Simplify data formats
- In on-premises deployment, which is typically the largest ongoing annual cost?
- Power
- Hardware maintenance
- IT staffing
- Network bandwidth
Answers: 1-C, 2-B, 3-C, 4-B, 5-C, 6-C, 7-B, 8-B, 9-B, 10-C
Common Pitfalls
1. Selecting an architecture gallery pattern without validating against your scale
A reference architecture designed for 10 million devices may be wildly over-engineered for a 500-device pilot. Always validate the selected pattern against your actual throughput, latency, and cost requirements before committing.
2. Copying a reference architecture without understanding its trade-offs
Every architecture gallery entry makes implicit assumptions (24/7 connectivity, uniform message rates, centralised identity). Understand which assumptions your deployment violates before adapting the pattern.
3. Ignoring the operational complexity of multi-service architectures
A gallery architecture with 12 cloud services looks elegant on a slide but requires expertise in each service for operations. Start with the simplest architecture that meets requirements and add services only when needed.
4. Not accounting for egress costs in multi-region architectures
Moving IoT data between cloud regions or from cloud to on-premises for hybrid architectures incurs significant data transfer costs. Model these costs explicitly in architecture selection.
39.5 Summary
This gallery provides visual references for cloud data architecture patterns essential for IoT deployments. Data lakes and warehouses serve different analytical needs, with lakehouses combining benefits of both. Stream processing enables real-time analytics while batch processing handles complex historical analysis. Sensor fusion techniques including Kalman and particle filters extract accurate state estimates from noisy measurements. Machine learning pipelines transform IoT data into predictive insights for maintenance, demand forecasting, and anomaly detection.
Concept Relationships
Cloud Data Architecture Gallery serves as:
- Visual Reference Library: Diagrams illustrate abstract architectural patterns discussed across data chapters
- Architecture Pattern Catalog: Each pattern (data lake, warehouse, lakehouse, stream processor, sensor fusion) solves specific IoT challenges
- Decision Support: Comparing visual architectures helps teams align on designs during planning
Pattern Relationships:
Data Lake (flexible, schema-on-read)
↓
Data Lakehouse (lake storage + warehouse performance)
↓
Data Warehouse (rigid, schema-on-write, fast queries)
Stream Processing → Real-time path
Batch Processing → Historical path
Lambda Architecture → Both combined
Integration Points:
- Cloud platforms (Cloud Data Platforms) implement these patterns as managed services
- Sensor fusion algorithms enable multi-sensor IoT applications requiring probabilistic state estimation
- ML pipelines transform patterns from this gallery into production inference systems
Key Insight: Visual architecture patterns provide shared language for cross-functional teams. “Let’s use a lakehouse pattern” immediately communicates hundreds of implementation decisions without lengthy explanations.
See Also
Within This Module:
- Cloud Data Platforms - AWS, Azure implementations of these patterns
- Cloud Data Quality - Data validation applied to these architectures
- Big Data Pipelines - Lambda architecture detailed explanation
Pattern Implementation Examples:
- Data Lake vs Warehouse Worked Example - Cost/performance tradeoffs
- Sensor Fusion Applications - Kalman filter use cases
- ML Model Deployment - Production ML architectures
External Visual Resources:
- AWS Architecture Icons - Official AWS diagram assets
- Azure Architecture Center - Reference architectures
- Martin Fowler’s Architecture Patterns - Enterprise patterns
Academic Foundations:
- Kalman, R. E. (1960) - Original Kalman filter paper (NASA applications)
- Thrun, et al. (2005) “Probabilistic Robotics” - Particle filters and sensor fusion
39.6 What’s Next
| If you want to… | Read this |
|---|---|
| Understand the IoT cloud reference model | Cloud Data IoT Reference Model |
| Explore specific cloud platforms and services | Cloud Data Platforms and Services |
| Learn about data quality and security in cloud architectures | Cloud Data Quality and Security |
| Study data processing foundations | Data in the Cloud |
| Return to the module overview | Big Data Overview |
Related Chapters & Resources
Cloud & Edge Architecture:
- Edge Fog Computing - Edge vs cloud trade-offs
- Cloud Computing - Cloud infrastructure fundamentals
Data Management:
- Data Storage and Databases - Storage options
- Big Data Overview - Big data concepts
- Modeling and Inferencing - ML model development
Learning Hubs:
- Quiz Navigator - Test your cloud knowledge