41  Fog Resource Allocation

In 60 Seconds

Fog resource allocation borrows from TCP congestion control: AIMD (Additive Increase Multiplicative Decrease) distributes fog workloads with provable fairness, achieving Nash equilibrium in 5-15 iterations. Game-theoretic approaches show that selfish fog nodes cause a Price of Anarchy of 1.3-2.0x, meaning uncoordinated allocation wastes 30-100% more resources than optimal cooperative strategies.

Key Concepts
  • Resource Allocation: Assignment of CPU, memory, network bandwidth, and storage to competing workloads on a fog node based on priority and QoS requirements
  • Quality of Service (QoS): Guaranteed performance levels (maximum latency, minimum throughput) negotiated for specific IoT workloads running on shared fog infrastructure
  • Work Stealing: Dynamic load-balancing technique where underloaded fog nodes pull tasks from overloaded neighbors, maximizing cluster utilization
  • Priority Queuing: Task scheduling where safety-critical events (fire alarm, motor fault) preempt lower-priority tasks (telemetry aggregation) immediately
  • Resource Reservation: Pre-allocating a fraction of fog node capacity for latency-sensitive workloads, preventing best-effort tasks from causing deadline misses
  • Bin Packing: Optimization algorithm maximizing the number of containerized workloads that fit on available fog hardware while meeting resource constraints
  • Elasticity at Edge: Ability to scale workloads up or down on fog nodes in response to demand changes, limited by physical hardware constraints unlike cloud
  • SLA Violation: Event where a fog workload fails to meet its contracted response time or throughput, potentially triggering failover or escalation to cloud resources

41.1 Learning Objectives

By the end of this chapter, you will be able to:

  • Apply TCP Principles: Adapt proven TCP congestion control mechanisms to fog resource management
  • Analyze Game Theory: Distinguish Nash equilibrium, Pareto efficiency, and Price of Anarchy in fog contexts
  • Design Resource Allocation: Create multi-objective optimization strategies for fog deployments
  • Implement AIMD: Apply Additive Increase Multiplicative Decrease for distributed fog load balancing
  • Evaluate Fairness Metrics: Compare max-min fairness, proportional fairness, and weighted allocation schemes
  • Identify Common Pitfalls: Diagnose and avoid resource allocation anti-patterns in fog deployments

41.2 Prerequisites

Before diving into this chapter, you should be familiar with:

Minimum Viable Understanding (MVU)

If you are short on time, focus on these three core concepts:

  1. AIMD for Fog (Section 2): The Additive Increase / Multiplicative Decrease algorithm from TCP enables distributed, fair resource sharing in fog without central coordination. Devices slowly increase demands (+1) and cut sharply (x0.5) when overload is detected.

  2. Nash Equilibrium vs. Pareto Efficiency (Section 3): Selfish device behavior converges to a Nash equilibrium that is often worse than the cooperative optimum (Pareto-efficient). The gap between them is the “Price of Anarchy.”

  3. Network Class Choice Matters (Section 3.3): Contention-based networks (Class-1, Wi-Fi) have unbounded inefficiency from selfish behavior; scheduled networks (Class-2, TDMA) bound inefficiency to at most 2x optimal. This directly impacts fog network design.

These three ideas underpin nearly all fog resource allocation decisions in practice.

Imagine a popular restaurant with limited seating. How do you decide who gets tables? First-come-first-served? Reservations? Priority for VIPs? Each approach has trade-offs between efficiency and fairness.

Fog computing faces the same challenge: thousands of IoT devices want to use limited fog resources (CPU, memory, bandwidth). Without smart allocation, some devices hog resources while others starve. The solutions come from two surprising sources:

  1. TCP congestion control - Internet protocols that solved fair sharing for internet traffic in the 1980s
  2. Game theory - Mathematical framework for analyzing strategic decisions

| Term | Simple Explanation |
|---|---|
| AIMD | Additive Increase, Multiplicative Decrease - slowly request more, quickly back off if overloaded |
| Nash Equilibrium | Stable state where no one can improve by changing strategy alone |
| Pareto Efficiency | Optimal allocation where you can’t help one device without hurting another |
| Price of Anarchy | How much worse is selfish behavior compared to coordinated sharing? |

Hey Sensor Squad! Imagine your school has ONE amazing playground with slides, swings, and monkey bars. At recess, ALL the kids want to play at the same time!

Sammy the Sensor says: “I figured it out! Each kid slowly walks toward the playground. If it gets too crowded, you step back fast and wait. That way, everyone gets a turn!”

That is exactly what AIMD does! Devices slowly ask for more (walk toward playground) and quickly back off (step back) when things get crowded.

Lila the LED adds: “But what if one kid is a bully and hogs the swings?” That is the Nash Equilibrium problem – everyone playing selfishly. The solution? Make fair rules (like taking turns), just like fog computing uses scheduling and quotas!

Max the Microcontroller reminds us: “The best playground is one where everyone gets equal time and nobody has to wait too long. That is Pareto Efficiency!”

41.3 Introduction

Resource allocation in fog computing addresses one of the most fundamental challenges in distributed systems: how do thousands of autonomous IoT devices share limited computational resources fairly and efficiently, without a central coordinator dictating every decision?

This chapter draws on two powerful theoretical foundations. First, TCP congestion control – a family of algorithms refined over 35+ years that enable billions of internet devices to share bandwidth without coordination. Second, game theory – the mathematical study of strategic interactions that reveals when selfish behavior leads to poor outcomes and how mechanism design can align individual incentives with collective good.

Understanding these foundations is essential for designing fog systems that remain stable under load, fair across heterogeneous devices, and efficient as deployments scale from tens to tens of thousands of nodes.

41.4 Fog Computing Optimization Strategies

Fog computing optimization strategies involve balancing multiple competing objectives – latency, energy, bandwidth, cost – through intelligent resource allocation, task partitioning, and placement decisions across the edge-fog-cloud continuum. The optimization process flows from inputs through an engine to deployment configurations.

Flowchart showing fog computing optimization pipeline: three input categories (application requirements with latency, energy, reliability; resource availability with CPU, memory, bandwidth; system constraints with cost, privacy, data gravity) feed into an optimization engine containing multi-objective optimizer, constraint solver, and resource allocator, which produces four strategy outputs (task partitioning, placement decisions, data management, network optimization) that generate deployment configurations for edge filters, fog analytics, and cloud storage.

The following table summarizes the four stages, their components, and specific details for each optimization strategy:

| Stage | Components | Details |
|---|---|---|
| Optimization Inputs | Application Requirements | Latency constraints, Energy budget, Reliability needs |
| | Resource Availability | Fog CPU/Memory, Network bandwidth, Device capabilities |
| | System Constraints | Cost limits, Privacy requirements, Data gravity |
| Optimization Engine | Multi-Objective Optimizer | Balance competing goals |
| | Constraint Solver | Respect system limits |
| | Resource Allocator | Distribute resources |
| Optimization Strategies | Task Partitioning | Divide computation, Parallel execution, Pipeline stages |
| | Placement Decisions | Edge processing, Fog offloading, Cloud delegation |
| | Data Management | Caching strategy, Aggregation rules, Compression |
| | Network Optimization | QoS prioritization, Traffic shaping, Protocol selection |
| Deployment Configuration | Resource Allocation | CPU/Memory per task, Network bandwidth, Storage quotas |
| | Processing Pipeline | Edge filters, Fog analytics, Cloud storage |

41.5 Alternative View: Fog Optimization Strategies Visualization

Fog computing optimization framework diagram with four layers: optimization inputs (application requirements, resource availability, system constraints) feeding into an optimization engine (multi-objective optimizer, constraint solver, resource allocator), which produces four strategy categories (task partitioning, placement decisions, data management, network optimization), culminating in deployment configurations that distribute processing across edge filters, fog analytics, and cloud storage tiers.

Figure 41.1: Fog computing optimization framework showing how application requirements and resource availability feed into optimization engines (multi-objective optimizer, constraint solver, resource allocator) that determine task partitioning, placement decisions, data management, and network optimization strategies, ultimately creating deployment configurations that distribute processing across edge, fog, and cloud tiers.

41.6 TCP Congestion Principles Applied to Fog

Fog computing resource allocation can draw on decades of TCP congestion control research. Five fundamental TCP principles map directly onto fog resource management, because both domains face the same core problem: distributed agents sharing limited resources without central coordination.

41.6.1 Five TCP Principles for Fog Computing

Diagram mapping five TCP congestion control principles to fog computing equivalents: (1) End-point control maps to edge device offloading adjustment, (2) Sliding window maps to dynamic resource quotas, (3) AIMD maps to gradual task increase with sharp overload cutback, (4) Loss and delay detection maps to queue length and response time monitoring, (5) RTT-based timers map to edge-to-fog latency estimation. Shows the evolution timeline from TCP Tahoe 1988 through Reno 1990, NewReno 1999, to CUBIC 2005.

41.7 Alternative View: TCP Principles for Fog

Five TCP congestion control principles applied to fog computing: (1) End-point control via negative feedback where edge devices adjust offloading rate based on fog acceptance or rejection, (2) Sliding window mechanism providing dynamic resource quotas per client, (3) AIMD algorithm enabling gradual load increase with sharp overload cuts, (4) Congestion detection via loss or delay using queue length and response time, (5) RTT-based estimation for latency monitoring and adaptive offloading, with TCP evolution timeline from Tahoe 1988 through Reno 1990, NewReno 1999, to CUBIC 2005 informing fog resource algorithms.

Figure 41.2: Five TCP congestion control principles applied to fog computing resource management.

This variant visualizes how AIMD (Additive Increase Multiplicative Decrease) works over time, showing the characteristic “sawtooth” pattern that enables fair, distributed resource sharing.

AIMD behavior timeline visualization showing the characteristic sawtooth pattern where resource usage gradually increases additively over time then drops sharply by half when congestion is detected, demonstrating how multiple independent devices converge to equal resource shares through this asymmetric adjustment pattern.


Key Insight: AIMD’s brilliance is that it achieves fair resource allocation without coordination. Each device independently increases slowly and decreases sharply based only on local observations (latency, rejections), yet the system converges to equitable sharing.

1. End-point Control via Negative Feedback

  • TCP: Sender adjusts transmission rate based on receiver ACKs/NACKs
  • Fog: Edge devices adjust offloading rate based on fog node acceptance/rejection signals

2. Sliding Window Mechanism

  • TCP: Dynamic buffer size controls how much data is in flight
  • Fog: Dynamic resource quotas (CPU/memory/bandwidth) per client control how many tasks are in flight

3. Additive Increase / Multiplicative Decrease (AIMD)

  • TCP: Slowly increase sending rate (+1 per RTT), cut sharply on loss (x0.5)
  • Fog: Gradually increase task offloading, cut sharply when fog node signals overload
  • Why it works: AIMD converges to fair, efficient resource sharing without central coordination

4. Inferring Congestion by Loss or Delay

  • TCP: Packet loss or increasing RTT signals network congestion
  • Fog: Task queue length or response time increases signal fog node overload

5. RTT-based Timers and Estimation

  • TCP: Adaptive retransmission timers based on measured round-trip time
  • Fog: Adaptive offloading decisions based on measured edge-to-fog latency
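
As a concrete sketch of this last principle, the smoothed estimator TCP uses for round-trip time (the SRTT/RTTVAR scheme standardized in RFC 6298) can be applied to edge-to-fog latency. The `LatencyEstimator` class, its constants, and the sample values below are illustrative, not part of any fog framework:

```python
# Sketch: TCP-style smoothed latency estimation (the SRTT/RTTVAR scheme of
# RFC 6298) adapted to edge-to-fog latency. Class and samples are illustrative.

class LatencyEstimator:
    """Tracks smoothed latency and latency variance for one fog node."""
    ALPHA = 1 / 8   # weight of new samples in the smoothed mean (TCP default)
    BETA = 1 / 4    # weight of new samples in the variance (TCP default)

    def __init__(self, first_sample_ms):
        self.srtt = first_sample_ms        # smoothed latency (like TCP SRTT)
        self.rttvar = first_sample_ms / 2  # latency variance (like TCP RTTVAR)

    def update(self, sample_ms):
        # Variance is updated first, against the old smoothed value (RFC 6298)
        self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - sample_ms)
        self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * sample_ms

    def timeout_ms(self):
        # Deadline for giving up on a fog response, analogous to TCP's RTO
        return self.srtt + 4 * self.rttvar

est = LatencyEstimator(first_sample_ms=5.0)
for sample in [5.2, 4.8, 6.0, 45.0]:   # a latency spike suggests fog overload
    est.update(sample)
print(f"smoothed latency: {est.srtt:.1f} ms, timeout: {est.timeout_ms():.1f} ms")
```

A device could treat `timeout_ms()` as its cutoff for offloading: if a fog response takes longer, fall back to local processing and trigger an AIMD decrease.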

TCP (Transmission Control Protocol) solved a hard problem in the 1980s: How do thousands of computers share the internet fairly without a central coordinator telling everyone when to send data?

The Breakthrough: Each computer independently measures how well things are going (are my packets arriving? how long do they take?) and adjusts its behavior accordingly. If packets get lost, slow down sharply. If everything’s working, speed up gradually.

Fog Connection: Fog computing faces the same challenge. Thousands of IoT devices want to offload tasks to fog nodes. No central coordinator. Each device must independently decide: “Should I send my next task to the fog node, or process it locally?” TCP’s 35+ years of proven algorithms (AIMD, congestion detection, adaptive timers) provide battle-tested solutions.

Real Example: Smart city with 10,000 sensors, 50 fog nodes. Each sensor independently monitors: “When I send tasks to Fog Node #7, how long until I get results?” If latency increases from 5ms to 50ms, the sensor infers overload and reduces offloading (AIMD multiplicative decrease). If latency stays low, gradually increase offloading (AIMD additive increase). Result: Automatic load balancing without complex coordination protocols.

41.7.1 AIMD Pseudocode for Fog Resource Allocation

The following pseudocode shows how a fog-aware IoT device implements AIMD-based task offloading:

# AIMD-based Fog Task Offloading
# Each device runs this independently - no central coordinator needed

INITIAL_RATE = 1          # Start with 1 task per interval
ADDITIVE_INCREMENT = 1    # Increase by 1 task per successful interval
MULTIPLICATIVE_FACTOR = 0.5  # Cut rate in half on overload
LATENCY_THRESHOLD_MS = 50    # Overload signal threshold
MIN_RATE = 1
MAX_RATE = 100

offload_rate = INITIAL_RATE

while device_is_active:
    # Send 'offload_rate' tasks to fog node this interval
    results = fog_node.submit_tasks(offload_rate)

    # Measure round-trip latency (like TCP measures RTT)
    measured_latency = results.average_latency_ms
    rejection_count = results.rejected_tasks

    # Congestion detection (like TCP loss/delay detection)
    if measured_latency > LATENCY_THRESHOLD_MS or rejection_count > 0:
        # MULTIPLICATIVE DECREASE: cut sharply
        offload_rate = max(MIN_RATE,
                          int(offload_rate * MULTIPLICATIVE_FACTOR))
        log(f"Overload detected! Rate: {offload_rate}")
    else:
        # ADDITIVE INCREASE: grow slowly
        offload_rate = min(MAX_RATE,
                          offload_rate + ADDITIVE_INCREMENT)
        log(f"Capacity available. Rate: {offload_rate}")

    wait(interval_duration)

Why this works: When multiple devices run this algorithm independently, they converge to equal shares of fog resources. The asymmetry between slow increase and fast decrease is mathematically proven to achieve fairness – devices that temporarily use more than their fair share lose resources faster than they gain them.
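
The fairness claim can be checked with a minimal simulation, assuming the same synchronized-interval model as the pseudocode (the capacity, starting rates, and parameters below are illustrative):

```python
# Minimal check of AIMD fairness: 4 devices share a fog node with capacity
# 100 task-units. Overload is modeled simply as total demand > capacity.
# Devices start with deliberately unequal rates.

CAPACITY = 100
rates = [5.0, 15.0, 40.0, 60.0]

for _ in range(200):
    overloaded = sum(rates) > CAPACITY      # same signal seen by every device
    for i in range(len(rates)):
        if overloaded:
            rates[i] = max(1.0, rates[i] * 0.5)    # multiplicative decrease
        else:
            rates[i] = min(100.0, rates[i] + 1.0)  # additive increase

spread = max(rates) - min(rates)
print(f"final rates: {[round(r, 1) for r in rates]}, spread: {spread:.1f}")
```

Each multiplicative-decrease event halves the gap between the largest and smallest rate, while additive increase leaves the gap unchanged, so the initial spread of 55 units shrinks toward zero over repeated congestion cycles.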

41.7.2 TCP Variants Evolution and Fog Implications

| TCP Variant | Year | Key Innovation | Fog Computing Lesson |
|---|---|---|---|
| TCP Tahoe | 1988 | Slow start + congestion avoidance | Start conservatively, probe for capacity |
| TCP Reno | 1990 | Fast retransmit/recovery | Quick recovery from temporary overload |
| TCP NewReno | 1999 | Multiple loss handling | Handle multiple simultaneous fog failures |
| TCP CUBIC | 2005 | Cubic growth for high-speed networks | Aggressive scaling for high-capacity fog tiers |

Key Takeaway: Fog resource allocation doesn’t need to reinvent congestion control—adapt proven TCP principles to task offloading, resource quotas, and distributed coordination.


Cross-Reference: Transport Layer Fundamentals

For deeper understanding of TCP congestion control mechanisms, see:

  • Transport Fundamentals - TCP flow control and congestion avoidance
  • Transport Optimizations - Modern TCP variants and performance tuning

41.8 Game Theory for Resource Allocation

Fog computing resource allocation is a multi-player game where devices (players) compete for shared fog resources (bandwidth, CPU, memory). Game theory provides mathematical frameworks for understanding equilibria and fairness. Understanding these frameworks helps system designers predict how devices will behave and design mechanisms that produce desirable outcomes even when each device acts selfishly.

41.8.1 Network Access Classes and Throughput Dependence

Consider two classes of network access technologies commonly used in fog deployments:

Class-1: 802.11 DCF (Distributed Coordination Function)

  • Throughput of each user depends on all users in the network
  • Contention-based: All devices compete for channel access
  • Example: Wi-Fi in crowded environments

Class-2: TDMA (Time Division Multiple Access)

  • Throughput depends only on number of users (N)
  • Each user gets fixed time slot: Throughput = Total_Bandwidth / N
  • Example: Cellular networks with scheduled access

Comparison diagram of two network access classes for fog computing: Class-1 (802.11 DCF contention-based) where each user's throughput depends on all other users' behavior with unbounded Price of Anarchy, versus Class-2 (TDMA scheduled) where throughput equals Total Bandwidth divided by N users with Price of Anarchy bounded at 2. Shows that Class-1 offers higher peak throughput but unpredictable fairness, while Class-2 provides guaranteed fairness with lower peak throughput. Recommends hybrid approach using TDMA for critical IoT traffic and Wi-Fi for best-effort data.

41.9 Alternative View: Game Theory Network Classes

Game theory framework for fog resource allocation comparing Class-1 802.11 DCF Wi-Fi contention-based access where throughput depends on all users behavior with complex interdependence and unbounded Price of Anarchy versus Class-2 TDMA cellular scheduled access where throughput equals Total Bandwidth divided by N depending only on user count with Price of Anarchy bounded at 2, showing Nash equilibrium concepts, Pareto efficiency goals, and fog implications for network design including hybrid approaches combining guaranteed TDMA for critical IoT traffic with best-effort Wi-Fi.

Game theory framework comparing Class-1 contention and Class-2 scheduled network access for fog resource allocation
Figure 41.3: Game theory framework comparing contention-based and scheduled access for fog resource allocation.

What is Game Theory? The mathematical study of strategic decision-making when multiple players interact. Each player wants the best outcome for themselves, but their actions affect everyone.

The Classic Example - Tragedy of the Commons: Imagine a shared pasture (the “commons”) where 10 farmers graze sheep. Each farmer thinks: “If I add one more sheep, I get 100% of the profit from that sheep, but the cost of overgrazing is shared among all 10 farmers (I only pay 10% of the cost).” Result: Everyone adds too many sheep, the pasture is destroyed, everyone loses.

Fog Computing Connection: Shared fog resources (bandwidth, CPU, memory) are the “commons.” Each IoT device wants to offload as many tasks as possible to the fog (benefit: save battery, get fast results). But if everyone offloads aggressively, the fog node becomes overloaded and everyone’s performance degrades.

Nash Equilibrium: The stable state where no device can improve its performance by changing its offloading strategy alone (given what others are doing). Problem: The Nash equilibrium is often inefficient – everyone would be better off if they collectively reduced offloading, but no individual has incentive to do so alone.

Pareto Efficiency: An allocation is Pareto-efficient if you can’t improve one device’s performance without hurting another. This is the “fair and efficient” goal.

Price of Anarchy (PoA): How much worse is the Nash equilibrium (selfish behavior) compared to the Pareto-efficient optimum (coordinated behavior)?

  • Class-1 (Wi-Fi): PoA can be unbounded (very bad – selfish behavior can collapse the system)
  • Class-2 (TDMA): PoA is less than or equal to 2 (at worst, selfish behavior is twice as bad as optimal – much better!)

41.9.1 Nash Equilibrium and Pareto Efficiency

Nash Equilibrium: A strategy profile where no player can improve their outcome by unilaterally changing their strategy (given others’ strategies remain fixed).

Problem: Nash equilibria can be inefficient – all players might achieve better outcomes through cooperation, but individual incentives prevent it.

Pareto Efficiency: An allocation is Pareto-efficient if no player can be made better off without making another player worse off.

Goal: Design fog resource allocation mechanisms that drive selfish devices toward Pareto-efficient Nash equilibria.

Quadrant diagram showing the relationship between Nash Equilibrium and Pareto Efficiency in fog resource allocation. The upper-right quadrant (both Nash and Pareto efficient) is the ideal design target. The lower-right quadrant (Nash but not Pareto) represents the common problem where selfish equilibrium is inefficient. The upper-left (Pareto but not Nash) is unstable because devices have incentive to deviate. The lower-left (neither) is the worst case. An arrow labeled 'Mechanism Design Goal' points from the lower-right problem quadrant to the upper-right ideal quadrant.

41.9.2 Price of Anarchy (PoA) Bounds

The Price of Anarchy measures how much system performance degrades due to selfish behavior compared to optimal coordinated allocation.

Class-1 Networks (802.11 DCF):

  • PoA: Unbounded (in worst case)
  • Meaning: Selfish behavior can arbitrarily degrade system performance
  • Real example: One aggressive Wi-Fi device using 90% airtime can reduce everyone else to 10% total

Class-2 Networks (TDMA):

  • PoA: At most 2 in the basic case, or (N+M)/N as a general bound (where N = number of players, M = number of resources)
  • Meaning: Even with selfish behavior, system achieves at least 50% of optimal efficiency
  • Real example: 10 devices sharing TDMA slots selfishly still achieve reasonable fairness

| Metric | Class-1 (Contention) | Class-2 (Scheduled) |
|---|---|---|
| Price of Anarchy | Unbounded | PoA <= 2 |
| Worst-case efficiency | Can reach 0% | At least 50% of optimal |
| Predictability | Low under load | High under any load |
| Peak throughput | Higher (when uncontested) | Lower (scheduling overhead) |
| Fairness guarantee | None | Guaranteed |
| Best for | Low-load, bursty traffic | Critical, time-sensitive IoT |

The Price of Anarchy bounds quantify inefficiency from selfish behavior:

Class-2 TDMA: With \(N = 10\) devices sharing bandwidth \(B = 100 \text{ Mbps}\), optimal allocation gives each \(\frac{100}{10} = 10 \text{ Mbps}\). Selfish Nash equilibrium: each device requests maximum, causing system to allocate \(\frac{100}{10} = 10 \text{ Mbps}\) anyway. PoA = \(\frac{\text{Optimal}}{\text{Nash}} = \frac{100}{100} = 1\) (no degradation — TDMA enforces fairness).

Class-1 Wi-Fi: One selfish device using CSMA/CA aggressively can capture 60% of airtime, leaving 40% for 9 others. Nash throughput = \(60 + 9(40/9) = 100\) total, but the distribution is \((60, 4.4, 4.4, \ldots)\). PoA for fairness = \(\frac{10}{40/9} = 2.25\) for victims; unbounded PoA means one device can drive others to near-zero throughput.
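
Both calculations can be replayed in a few lines of Python; the scenario numbers are taken directly from the two examples above:

```python
# Recomputing the two PoA examples above.

# Class-2 TDMA: 10 devices, 100 Mbps. The scheduler enforces B/N regardless
# of what devices request, so the selfish Nash outcome equals the optimum.
n, bandwidth = 10, 100
tdma_share = bandwidth / n                 # 10 Mbps each
poa_tdma = (bandwidth / n) / tdma_share    # 1.0: no degradation

# Class-1 Wi-Fi: one aggressive device captures 60% of airtime; the other
# 9 devices split the remaining 40% equally.
victim_share = 40 / 9                      # ~4.44% airtime per victim
fair_share = 100 / 10                      # 10% airtime per device
poa_fairness = fair_share / victim_share   # ~2.25 (90/40)

print(f"TDMA PoA: {poa_tdma}, Wi-Fi fairness PoA for victims: {poa_fairness:.2f}")
```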


41.9.3 Worked Example: Fog Resource Game

Scenario:

  • 5 IoT devices (players) competing for 1 fog node with 100 units of CPU capacity
  • Each device chooses offloading intensity: Low (10 units), Medium (20 units), High (30 units)
  • If total demand exceeds 100 units, fog node rejects tasks probabilistically

Payoff Matrix (for two representative devices, simplified):

| | Device B: Low (10) | Device B: Medium (20) | Device B: High (30) |
|---|---|---|---|
| Device A: Low (10) | (10, 10) - both safe | (10, 20) - B gains | (10, 30) - B gains more |
| Device A: Medium (20) | (20, 10) - A gains | (20, 20) - Pareto optimal | (20, 25) - slight overload |
| Device A: High (30) | (30, 10) - A gains more | (25, 20) - slight overload | (20, 20) - heavy overload, both lose |

Values represent effective throughput after accounting for rejection probability.

Strategies:

  • Cooperative (Pareto-efficient): Each device uses Medium (20 units) -> Total = 100 units, no rejection, everyone happy
  • Nash Equilibrium (selfish): Each device independently chooses High (30 units) because “more for me!” -> Total = 150 units -> 33% rejection rate -> Everyone worse off than cooperative case

Price of Anarchy: PoA = Performance_Cooperative / Performance_Nash = 100% success rate / 67% success rate ≈ 1.5

Design Lesson: Without coordination mechanisms (pricing, quotas, feedback), selfish devices create inefficient Nash equilibria. Solutions: admission control, backpressure signals (like TCP), or fair scheduling (Class-2 TDMA-style).
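
A short script, assuming the proportional-rejection model of this scenario, reproduces the cooperative and selfish outcomes:

```python
# Recomputing the worked example: 5 devices competing for a fog node with
# 100 CPU units. When total demand exceeds capacity, tasks are rejected
# proportionally (an assumption matching the scenario above).

CAPACITY = 100

def play(demands):
    """Return the fraction of submitted tasks the fog node accepts."""
    total = sum(demands)
    return min(1.0, CAPACITY / total)

cooperative = play([20] * 5)   # total 100 -> every task accepted
selfish = play([30] * 5)       # total 150 -> one third rejected

poa = cooperative / selfish    # ~1.5
print(f"cooperative success: {cooperative:.0%}, selfish: {selfish:.0%}, PoA: {poa:.2f}")
```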


41.9.4 Fairness Metrics for Fog Allocation

Beyond Nash equilibrium and Pareto efficiency, fog system designers must choose a specific fairness criterion that determines how resources are divided. Different fairness definitions lead to different optimal allocations.

| Fairness Metric | Definition | Fog Example | When to Use |
|---|---|---|---|
| Max-Min Fairness | Maximize the minimum allocation across all devices | Ensure every sensor gets at least X ms of fog CPU | Safety-critical systems where no device can be starved |
| Proportional Fairness | Maximize the product of all allocations (log-sum) | Higher-priority devices get proportionally more fog time | Mixed-priority deployments (medical vs. environmental sensors) |
| Weighted Fairness | Each device gets shares proportional to assigned weights | Industrial robots: weight 10; environmental sensors: weight 1 | Known device priority hierarchies |
| Jain’s Fairness Index | Measures equality of allocation: J = (sum(x_i))^2 / (n * sum(x_i^2)) | J = 1.0 means perfectly equal; J = 1/n means maximally unfair | Evaluating and comparing allocation schemes |

Jain’s Fairness Index in Practice:

For a system with N devices receiving allocations x_1, x_2, …, x_N:

  • J(x) = (sum of x_i)^2 / (N * sum of x_i^2)
  • Range: 1/N (worst case: one device gets everything) to 1.0 (best case: all devices equal)
  • Threshold: J > 0.9 is generally considered “fair” in practice

Example: 4 fog clients with CPU allocations [25, 25, 25, 25] yield J = 1.0 (perfectly fair). Allocations [90, 5, 3, 2] yield J ≈ 0.31 (highly unfair).
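
The index is a one-liner to implement; the helper below checks the two examples:

```python
def jain_index(allocations):
    """Jain's fairness index: 1.0 = perfectly equal, 1/n = maximally unfair."""
    n = len(allocations)
    return sum(allocations) ** 2 / (n * sum(x * x for x in allocations))

print(jain_index([25, 25, 25, 25]))          # 1.0 (perfectly fair)
print(round(jain_index([90, 5, 3, 2]), 2))   # 0.31 (highly unfair)
```

Because the index is scale-invariant, [1, 1, 1, 1] and [25, 25, 25, 25] score identically; it measures the shape of the allocation, not its magnitude.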


41.10 Common Pitfalls in Fog Resource Allocation

Resource Allocation Anti-Patterns

Avoid these common design mistakes when implementing fog resource allocation:

1. Static Allocation (the “Set and Forget” trap)

  • Mistake: Assigning fixed CPU/memory quotas to each device at deployment time
  • Why it fails: IoT workloads are bursty and time-varying. A sensor that needs 50% CPU during peak hours and 2% at night wastes resources with a fixed 25% allocation
  • Fix: Use dynamic allocation with AIMD or feedback-based adjustment

2. Ignoring Heterogeneity

  • Mistake: Treating all IoT devices identically in the allocation scheme
  • Why it fails: A medical sensor with 10ms latency requirements and an environmental logger with 60s tolerance have fundamentally different needs
  • Fix: Use weighted fairness or priority classes with differentiated service levels

3. Centralized Bottleneck

  • Mistake: Routing all allocation decisions through a single coordinator
  • Why it fails: Single point of failure; does not scale beyond hundreds of devices; coordinator itself becomes a resource bottleneck
  • Fix: Use distributed algorithms (AIMD, local game-theoretic strategies) where each device makes independent decisions

4. Optimizing the Wrong Metric

  • Mistake: Maximizing total throughput when the application needs minimum latency
  • Why it fails: A system can have high aggregate throughput while individual safety-critical devices experience unacceptable delays
  • Fix: Define clear optimization objectives tied to application requirements before selecting an allocation strategy

5. Ignoring Price of Anarchy

  • Mistake: Assuming devices will cooperate without incentive mechanisms
  • Why it fails: In Class-1 networks, unbounded PoA means one misbehaving device can collapse performance for all
  • Fix: Design mechanisms (pricing, quotas, backpressure) that align selfish incentives with system-wide efficiency

41.11 Knowledge Check

Test your understanding of resource allocation concepts.

Scenario: A fog node serves 4 IoT devices (A, B, C, D) competing for 100 units of CPU capacity. Each device independently runs AIMD (Additive Increase, Multiplicative Decrease) without coordination.

AIMD Parameters:

  • Additive increase: +5 units per interval (when no overload detected)
  • Multiplicative decrease: ×0.5 (when overload detected)
  • Overload threshold: Total demand > 100 units

Initial State (t=0):

  • Device A: 10 units, Device B: 20 units, Device C: 30 units, Device D: 40 units
  • Total: 100 units (at capacity, no overload yet)

Iteration 1 (t=1):

  • All devices increase: A=15, B=25, C=35, D=45
  • Total: 120 units > 100 → overload detected
  • All devices cut in half: A=7.5, B=12.5, C=17.5, D=22.5
  • Total: 60 units (under capacity)

Iteration 2 (t=2):

  • All devices increase: A=12.5, B=17.5, C=22.5, D=27.5
  • Total: 80 units (OK, no overload)

Iteration 3 (t=3):

  • All increase: A=17.5, B=22.5, C=27.5, D=32.5
  • Total: 100 units (OK, at capacity)

Iteration 4 (t=4):

  • All increase: A=22.5, B=27.5, C=32.5, D=37.5
  • Total: 120 units > 100 → overload
  • All cut: A=11.25, B=13.75, C=16.25, D=18.75
  • Total: 60 units

Iteration 5 (t=5):

  • All increase: A=16.25, B=18.75, C=21.25, D=23.75
  • Total: 80 units (OK)

Iteration 6 (t=6):

  • All increase: A=21.25, B=23.75, C=26.25, D=28.75
  • Total: 100 units (OK, at capacity)

Observation: By iteration 6, allocations are converging toward equal shares (25 units each). The system oscillates around 100 units capacity, with devices gradually approaching fair allocation.
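
The walkthrough can be replayed with a few lines of Python, using the same synchronized-update model (all devices increase each interval, then all halve if total demand exceeds capacity):

```python
# Replaying the worked AIMD trace: 4 devices, capacity 100, +5 additive
# increase, x0.5 multiplicative decrease, synchronized intervals.

CAPACITY, ALPHA, BETA = 100, 5, 0.5
allocs = [10.0, 20.0, 30.0, 40.0]

for t in range(1, 7):
    allocs = [x + ALPHA for x in allocs]       # everyone probes for capacity
    if sum(allocs) > CAPACITY:                 # overload detected
        allocs = [x * BETA for x in allocs]    # everyone cuts in half
    print(f"t={t}: {allocs}, total={sum(allocs)}")
# t=6 ends at [21.25, 23.75, 26.25, 28.75], matching the walkthrough
```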

Mathematical Proof of Convergence:

  • Let \(x_i(t)\) be device \(i\)’s allocation at time \(t\)
  • AIMD dynamics: \(x_i(t+1) = x_i(t) + \alpha\) if no overload, \(x_i(t+1) = \beta \cdot x_i(t)\) if overload
  • At equilibrium: \(\sum x_i = C\) (capacity), and all devices have equal allocation (Chiu-Jain theorem, 1989)

Convergence Time Estimate:

  • From unequal start (10, 20, 30, 40) to near-equal (23, 24, 25, 28): approximately 10-15 AIMD iterations
  • If interval = 5 seconds, convergence time = 50-75 seconds

Key Insight: AIMD achieves fair resource sharing without coordination. No central controller. No inter-device communication. Each device measures only local signals (overload yes/no) and adjusts independently. The asymmetry (slow additive increase +5, fast multiplicative decrease ×0.5) is what drives convergence to fairness.
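The scenario above can be replayed in a few lines (a minimal sketch; `aimd_step` is my own helper, using exactly the parameters listed for the scenario):

```python
# Reproduce the 4-device AIMD trace: +5 additive increase,
# x0.5 multiplicative decrease when total demand exceeds capacity.
def aimd_step(rates, capacity=100, alpha=5, beta=0.5):
    """One synchronized AIMD iteration for all devices."""
    rates = [r + alpha for r in rates]      # additive increase
    if sum(rates) > capacity:               # overload detected
        rates = [r * beta for r in rates]   # multiplicative decrease
    return rates

rates = [10, 20, 30, 40]                    # initial state (t=0)
for t in range(1, 7):
    rates = aimd_step(rates)
    print(f"t={t}: {rates}, total={sum(rates)}")
```

Running this reproduces the t=1 through t=6 allocations worked out above, ending at (21.25, 22.5+1.25, 26.25, 28.75) totaling exactly 100 units at t=6.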

| Application Context | Max-Min Fairness | Proportional Fairness | Weighted Fairness |
|---|---|---|---|
| All devices have equal priority | ✓ Best choice | Alternative if easier to implement | Not needed |
| Devices have different priorities (medical > environmental) | ✗ Wastes resources on low-priority | ✗ Doesn’t respect priorities | ✓ Best choice |
| Bottleneck resource (CPU, bandwidth) | ✓ Guarantees minimum for all | ~ Balances min guarantee + efficiency | ✓ With weights = priority |
| Maximize total throughput | ✗ Sacrifices throughput for fairness | ✓ Balances fairness + throughput | ~ Depends on weight choice |
| Prevent device starvation | ✓ Strongest guarantee | ✓ Prevents starvation (log penalty) | ✓ If weights > 0 for all |

Formulas:

Max-Min Fairness: Maximize \(\min(x_1, x_2, ..., x_n)\) subject to \(\sum x_i \leq C\) - Result: All devices get equal share (or their max demand, whichever is smaller) - Example: 4 devices, 100 units capacity → each gets 25 units

Proportional Fairness: Maximize \(\sum \log(x_i)\) subject to \(\sum x_i \leq C\) and \(x_i \leq d_i\) (each device’s demand caps its share) - Result: Devices receive equal shares of capacity; a device whose demand is below the equal share keeps its full demand, and the freed capacity is split among the rest - Example: Demands (10, 20, 30, 40), capacity 80 → allocations (10, 20, 25, 25)

Weighted Fairness: Maximize \(\sum w_i \log(x_i)\) subject to \(\sum x_i \leq C\) - Result: Devices get shares proportional to their weights - Example: Weights (1, 2, 3, 4), capacity 100 → allocations (10, 20, 30, 40)
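These formulas can be sanity-checked numerically. The sketch below (function names are mine) implements max-min fairness via progressive filling, with each device's demand capping its share, and the closed-form weighted allocation:

```python
def water_fill(demands, capacity):
    """Max-min progressive filling: fully satisfy any device whose
    demand is below the equal share of remaining capacity, then
    split what is left equally among the rest."""
    alloc = [0.0] * len(demands)
    active = list(range(len(demands)))
    remaining = capacity
    while active:
        share = remaining / len(active)
        capped = [i for i in active if demands[i] <= share]
        if not capped:                  # nobody capped: equal split
            for i in active:
                alloc[i] = share
            break
        for i in capped:                # satisfy capped devices fully
            alloc[i] = demands[i]
            remaining -= demands[i]
        active = [i for i in active if i not in capped]
    return alloc

def weighted_shares(weights, capacity):
    """Maximizer of sum w_i log(x_i) subject to sum x_i <= C:
    shares proportional to the weights."""
    total = sum(weights)
    return [capacity * w / total for w in weights]

print(water_fill([50, 50, 50, 50], 100))   # equal demands -> 25 each
print(weighted_shares([1, 2, 3, 4], 100))  # -> [10.0, 20.0, 30.0, 40.0]
```

With identical log utilities and demand caps on a single shared resource, progressive filling also solves the proportional-fairness problem, which is why the two often agree in this setting.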

Key Decision Rule:

  • Equal-priority devices? → Max-min fairness (simplest, most fair)
  • Want to maximize total system utility? → Proportional fairness (balances fairness + efficiency)
  • Devices have explicit priority levels? → Weighted fairness with weights = priority

Common Mistake: Using max-min fairness when devices have wildly different demands (e.g., video camera needing 100× more bandwidth than temperature sensor). This wastes resources giving the temperature sensor bandwidth it cannot use. Use proportional or weighted fairness instead.

Common Mistake: Ignoring Price of Anarchy in Class-1 Networks (Wi-Fi)

The Mistake: Deploying fog gateways on shared Wi-Fi (Class-1 network with contention-based access) and assuming all devices will cooperate fairly. One selfish device aggressively transmits without backoff, monopolizing the channel.

Real-World Example: A smart office deployed 500 occupancy sensors on shared Wi-Fi. One poorly configured sensor had a firmware bug causing it to retry failed transmissions with zero backoff (violating 802.11 CSMA/CA). This single device consumed 60% of airtime, reducing bandwidth for the other 499 sensors to 40% shared.

Why This Happens (Price of Anarchy):

  • Class-1 networks (Wi-Fi): Each device’s throughput depends on all other devices’ behavior (contention-based MAC)
  • Nash equilibrium: Each device maximizes its own throughput → everyone transmits aggressively → collisions → throughput collapse
  • Price of Anarchy (PoA): Ratio of optimal throughput to selfish throughput = unbounded for Class-1 networks

Proof by Example:

  • Cooperative: 10 devices, equal share → 10 Mbps each, total 100 Mbps system throughput
  • Selfish: Each device transmits at max rate → collisions → 1 Mbps each, total 10 Mbps system throughput
  • PoA: 100 Mbps / 10 Mbps = 10× (90% efficiency loss due to selfishness!)
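A toy slotted-contention model (my own simplification, not an 802.11 simulation) makes the collapse concrete: a slot carries a frame only when exactly one of the n devices transmits, so aggressive transmission probabilities destroy throughput for everyone.

```python
def slot_throughput(n, p):
    """Expected successful transmissions per slot when each of n
    devices independently transmits with probability p: a slot
    succeeds only if exactly one device transmits."""
    return n * p * (1 - p) ** (n - 1)

n = 10
cooperative = slot_throughput(n, 1 / n)   # polite backoff: p = 1/n
selfish = slot_throughput(n, 0.9)         # aggressive: p = 0.9
print(f"cooperative: {cooperative:.3f} successes/slot")
print(f"selfish:     {selfish:.2e} successes/slot")
# The ratio cooperative/selfish grows without bound as p -> 1,
# matching the unbounded PoA claimed for Class-1 networks.
```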

How to Avoid:

1. Use Class-2 networks (TDMA/scheduled access) for critical fog traffic:

  • Cellular (4G/5G): Scheduled uplink slots, PoA bounded at 2× (50% worst-case efficiency)
  • Proprietary IoT (Zigbee, Thread): Coordinator-managed time slots
  • Ethernet with switch QoS: Not truly scheduled, but flow control limits PoA

2. Enforce admission control on Class-1 networks:

# Limit number of devices per Wi-Fi AP to prevent overload
# (device_count_on_ap and reject_new_device are deployment-specific hooks)
if device_count_on_ap > 50:
    reject_new_device()
    # Force device to connect to a different AP

3. Use 802.11e QoS (WMM) to prioritize critical traffic:

  • AC_VO (Voice): Fog node control messages, safety-critical sensors
  • AC_VI (Video): Cameras, real-time streams
  • AC_BE (Best Effort): Bulk telemetry, non-urgent data

4. Monitor for selfish devices:

# Detect devices consuming disproportionate airtime
# (flag_as_selfish and rate_limit are deployment-specific hooks)
if device_airtime_percent > 2 * fair_share:
    flag_as_selfish(device)
    rate_limit(device, fair_share * 1.5)  # Enforce cap slightly above fair share

Key Numbers:

  • Fair share: 1 / (number of devices) of total airtime
  • Selfish threshold: Device consuming > 2× fair share for > 60 seconds
  • PoA for Class-1: Unbounded (can approach infinity with adversarial behavior)
  • PoA for Class-2: Bounded at 2× (worst case is 50% efficiency)
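The airtime thresholds above can be folded into a single check (a sketch; the function name and signature are mine):

```python
def is_selfish(airtime_fraction, n_devices, sustained_seconds):
    """Flag a device that holds more than twice its fair share of
    airtime for longer than 60 seconds (the Key Numbers rule)."""
    fair_share = 1.0 / n_devices
    return airtime_fraction > 2 * fair_share and sustained_seconds > 60

# The smart-office sensor from the example: 60% of airtime on a
# 500-device network (fair share 0.2%), sustained for minutes.
print(is_selfish(0.60, 500, 300))   # True
print(is_selfish(0.003, 500, 300))  # False: below 2x fair share (0.004)
```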

Decision Rule: For fog deployments with > 50 devices or any safety-critical traffic, avoid Class-1 networks (Wi-Fi) or use strict admission control + QoS. Prefer Class-2 scheduled access (cellular, proprietary IoT) where PoA is provably bounded.

41.12 Summary

41.12.1 Key Takeaways

This chapter covered fog resource allocation strategies using two powerful theoretical foundations:

| Concept | Key Insight | Practical Application |
|---|---|---|
| Optimization Framework | Balance latency, energy, bandwidth, and cost through multi-objective optimization | Use the four-stage pipeline: inputs, engine, strategies, deployment configuration |
| TCP Principles for Fog | Five proven congestion control principles map directly to fog resource management | Implement AIMD-based task offloading without needing a central coordinator |
| AIMD Algorithm | Slow additive increase + fast multiplicative decrease = fair convergence | Each device independently adjusts offloading rate based on local latency/rejection signals |
| Game Theory | Selfish devices converge to Nash equilibria that are often inefficient | Design mechanisms (pricing, quotas, feedback) to align selfish incentives with system efficiency |
| Network Classes | Class-1 (contention) has unbounded PoA; Class-2 (scheduled) bounds PoA at 2× | Use TDMA for critical IoT traffic, Wi-Fi for best-effort data; hybrid recommended |
| Fairness Metrics | Max-min, proportional, and weighted fairness serve different deployment needs | Choose fairness metric based on device heterogeneity and application requirements |
| Common Pitfalls | Static allocation, ignoring heterogeneity, and centralized bottlenecks are frequent errors | Use dynamic, distributed, priority-aware allocation with appropriate incentive mechanisms |

41.12.2 Design Decision Checklist

When designing a fog resource allocation system, consider:

  1. Do devices need guaranteed minimum resources? Use max-min fairness or TDMA scheduling
  2. Are device priorities heterogeneous? Use weighted fairness with differentiated service classes
  3. Is central coordination feasible? If not, use distributed AIMD-based approaches
  4. What is the acceptable Price of Anarchy? Choose network class accordingly (Class-2 for bounded PoA)
  5. How bursty is the workload? Dynamic allocation (AIMD) outperforms static quotas for variable loads

41.13 How It Works: AIMD Convergence to Fair Allocation

Let’s trace how AIMD achieves fair resource allocation through a step-by-step example with 3 competing fog clients:

Setup: Fog node with 100 CPU units. Three devices (A, B, C) start with unequal allocations: A=10, B=30, C=50 (total 90 units).

Iteration 1 - Additive Increase Phase:

Initial: A=10, B=30, C=50 (total 90, under capacity)
All devices increase by +5: A=15, B=35, C=55 (total 105)
Overload detected! (105 > 100)

Iteration 2 - Multiplicative Decrease:

All devices cut by ×0.5: A=7.5, B=17.5, C=27.5 (total 52.5)
Under capacity, increase: A=12.5, B=22.5, C=32.5 (total 67.5)

Iteration 3-5 - Sawtooth Pattern Emerges:

Iteration 3: A=17.5, B=27.5, C=37.5 (total 82.5) → increase
Iteration 4: A=22.5, B=32.5, C=42.5 (total 97.5) → increase
Iteration 5: A=27.5, B=37.5, C=47.5 (total 112.5) → OVERLOAD
    Cut: A=13.75, B=18.75, C=23.75 (total 56.25)

Convergence (after ~10-15 iterations):

All devices converge to approximately equal shares:
A ≈ 33, B ≈ 33, C ≈ 34 (total 100)

Why This Works:

  1. Additive increase (+5) probes for capacity slowly, preventing wild oscillations
  2. Multiplicative decrease (×0.5) quickly backs off when overload detected
  3. Asymmetry (slow up, fast down) causes devices with larger allocations to lose more during cuts
  4. No coordination - each device measures only local signals (acceptance/rejection)
  5. Converges to fairness - mathematically proven by Chiu-Jain theorem (1989)

Real-World Timing: With 5-second intervals, convergence takes 50-75 seconds from arbitrary start to near-equal shares.
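Convergence toward fairness can be quantified with Jain's fairness index, \(J = (\sum x_i)^2 / (n \sum x_i^2)\), which equals 1.0 for perfectly equal shares. A quick check on the 3-device trace (helper names mine):

```python
def jain(xs):
    """Jain's fairness index: 1.0 means perfectly equal shares."""
    return sum(xs) ** 2 / (len(xs) * sum(x * x for x in xs))

rates = [10.0, 30.0, 50.0]                  # unequal starting allocations
print(f"start: J = {jain(rates):.3f}")
for _ in range(30):
    rates = [r + 5 for r in rates]          # additive increase (+5)
    if sum(rates) > 100:                    # overload: total > capacity
        rates = [r * 0.5 for r in rates]    # multiplicative decrease (x0.5)
print(f"after 30 iterations: J = {jain(rates):.3f}")
```

The index starts around 0.77 and climbs toward 1.0 as the multiplicative cuts repeatedly halve the gap between large and small allocations.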

41.14 Incremental Examples

41.14.1 Example 1: Single Device AIMD (Basic)

Scenario: One edge device offloading tasks to fog node.

Code:

# A sketch: fog_node is an assumed client handle whose
# submit_tasks(rate) returns a result with a latency_ms field.
offload_rate = 1  # Start conservative
for interval in range(20):
    latency = fog_node.submit_tasks(offload_rate).latency_ms
    if latency > 50:  # Overload threshold (ms)
        offload_rate = max(1, int(offload_rate * 0.5))  # Multiplicative cut
        print(f"Overload! Reduce to {offload_rate}")
    else:
        offload_rate = min(100, offload_rate + 1)  # Additive increase
        print(f"Capacity available, increase to {offload_rate}")

Result: Device probes fog capacity, backing off when congested.

41.14.2 Example 2: Multi-Device Competition (Intermediate)

Scenario: 5 devices compete for shared fog resources.

Behavior:

  • Devices A-E start at rates: [5, 10, 15, 20, 25]
  • Total capacity: 100 tasks/interval
  • Initial total: 75 (under capacity)

Evolution:

Interval 1: All increase → [6,11,16,21,26] = 80 (OK)
Interval 2: All increase → [7,12,17,22,27] = 85 (OK)
...
Interval 5: [10,15,20,25,30] = 100 (at limit)
Interval 6: [11,16,21,26,31] = 105 (OVERLOAD)
    Cut all: [5,8,10,13,15] = 51
Interval 7-15: Gradual increase with periodic cuts
Final state: All converge to ~20 tasks each

Key Insight: Despite different starting points, AIMD drives toward equal shares.

41.14.3 Example 3: Weighted Fairness with Priorities (Advanced)

Scenario: Medical sensors (priority 3) and environmental sensors (priority 1) share fog node.

Modified AIMD:

def weighted_aimd(rate, priority, overload):
    """One AIMD step: the additive increase scales with priority,
    while the multiplicative decrease is the same for every device."""
    if overload:
        return rate * 0.5     # Same cut for all
    return rate + priority    # High priority increases faster

Result: Medical sensors converge to roughly 3× the allocation of environmental sensors (weighted fairness with weights equal to priorities).
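A minimal simulation of this priority-weighted scheme (assumptions mine: 100-unit capacity, all devices starting at rate 10, synchronized intervals) shows the allocations settling at the 3:1 priority ratio:

```python
def weighted_aimd_sim(priorities, capacity=100, iterations=500):
    """Synchronized AIMD where each device's additive step equals
    its priority; all devices apply the same x0.5 cut on overload."""
    rates = [10.0] * len(priorities)
    for _ in range(iterations):
        rates = [r + p for r, p in zip(rates, priorities)]
        if sum(rates) > capacity:
            rates = [r * 0.5 for r in rates]
    return rates

medical, environmental = weighted_aimd_sim([3, 1])
print(f"medical={medical:.1f}, environmental={environmental:.1f}, "
      f"ratio={medical / environmental:.2f}")  # ratio approaches 3
```

At the sawtooth fixed point both rates scale with their priorities, so the ratio holds at 3 throughout each increase-and-cut cycle.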

41.15 Concept Check

41.16 Try It Yourself

Exercise 1: AIMD Simulation

Implement a simple AIMD simulator for 3 devices competing for 100 CPU units:

# Fog Resource AIMD Simulator
devices = [10, 20, 30]  # Starting rates
capacity = 100
history = []

for iteration in range(20):
    total = sum(devices)
    history.append(devices.copy())

    if total > capacity:
        # Multiplicative decrease
        devices = [max(1, int(d * 0.5)) for d in devices]
        print(f"Iter {iteration}: OVERLOAD {total} > {capacity}{devices}")
    else:
        # Additive increase
        devices = [min(50, d + 5) for d in devices]
        print(f"Iter {iteration}: OK {total}{capacity}{devices}")

# Plot history to see convergence pattern

Expected Output: Devices converge to approximately equal shares (33 each) after 10-15 iterations.

Exercise 2: Calculate Jain’s Fairness Index

Given allocations [40, 30, 20, 10], compute fairness:

def jains_fairness_index(allocations):
    n = len(allocations)
    sum_x = sum(allocations)
    sum_x_squared = sum(x**2 for x in allocations)
    return (sum_x**2) / (n * sum_x_squared)

allocations = [40, 30, 20, 10]
J = jains_fairness_index(allocations)
print(f"Jain's Index: {J:.3f}")
# Should output ~0.833 (below 0.9 fairness threshold)

Exercise 3: Compare Class-1 vs Class-2 PoA

Research question: For a fog deployment with 50 devices, calculate worst-case throughput under selfish behavior for: - Class-1 (Wi-Fi contention) - Class-2 (TDMA scheduled)

Use the PoA bounds from this chapter to quantify the difference.

41.17 Concept Relationships

| Concept | Builds On | Enables | Common Confusion |
|---|---|---|---|
| AIMD Algorithm | TCP congestion control principles | Distributed fair resource sharing without coordination | Thinking symmetric increase/decrease would work (asymmetry is essential) |
| Nash Equilibrium | Game theory foundations | Predicting stable states under selfish behavior | Confusing Nash equilibrium (stable) with Pareto efficiency (optimal) |
| Price of Anarchy | Nash equilibrium, Pareto efficiency | Network class selection (Class-1 vs Class-2) | Assuming all networks have bounded PoA (Class-1 Wi-Fi is unbounded) |
| Jain’s Fairness Index | Allocation fairness metrics | Evaluating and comparing resource allocation schemes | Treating J = 0.8 as “fair” (threshold is 0.9+) |
| Weighted Fairness | Proportional fairness, device priorities | Priority-aware fog resource allocation | Using max-min fairness when devices have different priorities |

41.18 See Also

41.19 What’s Next

| Topic | Chapter | Description |
|---|---|---|
| Fog Energy-Latency Trade-offs | Fog Energy and Latency | Energy-latency optimization, hierarchical bandwidth allocation, and proximity benefits for battery-powered fog networks |
| Fog Network Selection | Fog Network Selection | HetNet selection algorithms that determine which network classes are available for resource allocation |
| Transport Fundamentals | Transport Fundamentals | Deep dive into TCP congestion control mechanisms underlying the AIMD fog allocation principles |