
davidtw0320 / Resources-Allocation-in-The-Edge-Computing-Environment-Using-Reinforcement-Learning

Licence: other
Simulated the scenario between edge servers and users with a clear graphic interface. Also, implemented the continuous control with Deep Deterministic Policy Gradient (DDPG) to determine the resources allocation (offload targets, computational resources, migration bandwidth) in the edge servers

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Resources-Allocation-in-The-Edge-Computing-Environment-Using-Reinforcement-Learning

Hindsight Experience Replay
This is the pytorch implementation of Hindsight Experience Replay (HER) - Experiment on all fetch robotic environments.
Stars: ✭ 134 (-15.19%)
Mutual labels:  ddpg
Tianshou
An elegant PyTorch deep reinforcement learning library.
Stars: ✭ 4,109 (+2500.63%)
Mutual labels:  ddpg
pytorch-distributed
Ape-X DQN & DDPG with pytorch & tensorboard
Stars: ✭ 98 (-37.97%)
Mutual labels:  ddpg
Minimalrl
Implementations of basic RL algorithms with minimal lines of code! (pytorch based)
Stars: ✭ 2,051 (+1198.1%)
Mutual labels:  ddpg
Deeprl
Modularized Implementation of Deep RL Algorithms in PyTorch
Stars: ✭ 2,640 (+1570.89%)
Mutual labels:  ddpg
Pytorch Ddpg Naf
Implementation of algorithms for continuous control (DDPG and NAF).
Stars: ✭ 254 (+60.76%)
Mutual labels:  ddpg
Reinforcement Learning
🤖 Implementations of Reinforcement Learning algorithms.
Stars: ✭ 104 (-34.18%)
Mutual labels:  ddpg
Rainy
☔ Deep RL agents with PyTorch☔
Stars: ✭ 39 (-75.32%)
Mutual labels:  ddpg
Pytorch Drl
PyTorch implementations of various Deep Reinforcement Learning (DRL) algorithms for both single agent and multi-agent.
Stars: ✭ 233 (+47.47%)
Mutual labels:  ddpg
reinforcement learning with Tensorflow
Minimal implementations of reinforcement learning algorithms by Tensorflow
Stars: ✭ 28 (-82.28%)
Mutual labels:  ddpg
Deep Reinforcement Learning Algorithms
31 projects in the framework of Deep Reinforcement Learning algorithms: Q-learning, DQN, PPO, DDPG, TD3, SAC, A2C and others. Each project is provided with a detailed training log.
Stars: ✭ 167 (+5.7%)
Mutual labels:  ddpg
Rlcycle
A library for ready-made reinforcement learning agents and reusable components for neat prototyping
Stars: ✭ 184 (+16.46%)
Mutual labels:  ddpg
DDPG Torcs PyTorch
Using PyTorch and DDPG to play Torcs
Stars: ✭ 44 (-72.15%)
Mutual labels:  ddpg
Machin
Reinforcement learning library (framework) designed for PyTorch; implements DQN, DDPG, A2C, PPO, SAC, MADDPG, A3C, APEX, IMPALA ...
Stars: ✭ 145 (-8.23%)
Mutual labels:  ddpg
Deep-Reinforcement-Learning-With-Python
Master classic RL, deep RL, distributional RL, inverse RL, and more using OpenAI Gym and TensorFlow with extensive Math
Stars: ✭ 222 (+40.51%)
Mutual labels:  ddpg
Deep Reinforcement Learning In Large Discrete Action Spaces
Implementation of the algorithm in Python 3, TensorFlow and OpenAI Gym
Stars: ✭ 132 (-16.46%)
Mutual labels:  ddpg
Train Robot Arm From Scratch
Build environment and train a robot arm from scratch (Reinforcement Learning)
Stars: ✭ 242 (+53.16%)
Mutual labels:  ddpg
nips rl
Code for NIPS 2017 learning to run challenge
Stars: ✭ 37 (-76.58%)
Mutual labels:  ddpg
UAV-DDPG
Code for paper "Computation Offloading Optimization for UAV-assisted Mobile Edge Computing: A Deep Deterministic Policy Gradient Approach"
Stars: ✭ 133 (-15.82%)
Mutual labels:  ddpg
mujoco-benchmark
Provide full reinforcement learning benchmark on mujoco environments, including ddpg, sac, td3, pg, a2c, ppo, library
Stars: ✭ 101 (-36.08%)
Mutual labels:  ddpg

Resources Allocation in The Edge Computing Environment Using Reinforcement Learning

Summary

Cloud-computing-based mobile applications, such as augmented reality (AR), face recognition, and object recognition, have become popular in recent years. However, cloud computing may cause high latency and increased backhaul bandwidth consumption because of remote execution. To address these problems, edge computing improves response times and relieves backhaul pressure by moving storage and computing resources closer to mobile users.

Considering the computational resources, migration bandwidth, and offloading targets in an edge computing environment, this project uses Deep Deterministic Policy Gradient (DDPG), a reinforcement learning (RL) approach, to allocate resources for mobile users.

GUI picture originates from: IEEE Innovation at Work


Prerequisite

  • Python 3.7.5
  • Tensorflow 2.2.0
  • Tkinter 8.6

Build Setup

Run The System

$ python3 src/run_this.py

Text Interface Enable / Disable (in run_this.py)

TEXT_RENDER = True / False

Graphic Interface Enable / Disable (in run_this.py)

SCREEN_RENDER = True / False

Key Point

Edge Computing Environment

  • Mobile User

    • Users move according to the mobility data provided by CRAWDAD, collected from mobile device users at a subway station in Seoul, Korea.
    • Users' devices offload tasks to one edge server to obtain computation service.
    • After a request task has been processed, the user receives the result from the edge server and then offloads a new task to an edge server.
  • Edge Server

    • Responsible for offering computational resources (6.3 * 1e7 byte/sec) and processing tasks for mobile users.
    • Each edge server can provide service to only a limited number of users and allocates computational resources among them.
    • A task may be migrated from one edge server to another within a limited bandwidth (1e9 byte/sec).
  • Request Task: VOC SSD300 Object Detection (summarized as a state machine after this section)

    • state 1 : start to offload a task to the edge server
    • state 2 : request task is on the way to the edge server (2.7 * 1e4 byte)
    • state 3 : request task is processed (1.08 * 1e6 byte)
    • state 4 : request task is on the way back to the mobile user (96 byte)
    • state 5 : disconnect (default)
    • state 6 : request task is migrated to another edge server
  • Graphic Interface

    gui

    • Edge servers (static)
      • Big dots with consistent color
    • Mobile users (dynamic)
      • Small dots with changing color
      • Color
        • Red : request task is in state 5
        • Green : request task is in state 6
        • others : request task is handled by the edge server with the same color and is in state 1 ~ state 4
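
As a reference, the request-task life cycle above can be read as a small state machine. The sketch below is illustrative only: TaskState is a hypothetical name, and its members simply restate the six states listed under Request Task.

    from enum import IntEnum

    class TaskState(IntEnum):
        OFFLOAD    = 1  # start to offload a task to the edge server
        UPLOADING  = 2  # request (2.7 * 1e4 byte) travels to the edge server
        PROCESSING = 3  # task is processed on the edge server (1.08 * 1e6 byte)
        RETURNING  = 4  # result (96 byte) travels back to the mobile user
        DISCONNECT = 5  # default: no active connection
        MIGRATING  = 6  # task is migrated to another edge server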

Deep Deterministic Policy Gradient (in DDPG.py)

  • Description

    Determining the offloading server for each user is a discrete decision, while allocating computing resources and migration bandwidth are continuous decisions. Deep Deterministic Policy Gradient (DDPG), a model-free, off-policy actor-critic algorithm, operates on a continuous action space, so the discrete offloading choice is encoded as a one-hot block inside the continuous action vector (see generate_action below); a single DDPG agent can therefore produce all three decisions. DDPG also updates its model weights at every step, so the policy can adapt to the dynamic environment immediately.

  • State

      import numpy as np

      # two_to_one, r_bound, and b_bound come from elsewhere in the repository:
      # two_to_one flattens the 2-D edge-to-edge bandwidth table, and r_bound /
      # b_bound are the resource and bandwidth upper bounds used for scaling.
      def generate_state(two_table, U, E, x_min, y_min):
          one_table = two_to_one(two_table)
          # layout: [edge capabilities | bandwidth table | offload targets | user locations]
          S = np.zeros((len(E) + one_table.size + len(U) + len(U)*2))
          count = 0
          for edge in E:
              # normalized available computing resources of each edge server
              S[count] = edge.capability/(r_bound*10)
              count += 1
          for i in range(len(one_table)):
              # normalized available migration bandwidth of each edge-to-edge link
              S[count] = one_table[i]/(b_bound*10)
              count += 1
          for user in U:
              # offloading target (edge server id) of each user
              S[count] = user.req.edge_id/100
              count += 1
          for user in U:
              # normalized (x, y) location of each user, shifted to be non-negative
              S[count] = (user.loc[0][0] + abs(x_min))/1e5
              S[count+1] = (user.loc[0][1] + abs(y_min))/1e5
              count += 2
          return S
    • Available computing resources of each edge server
    • Available migration bandwidth of each connection between edge servers
    • Offloading target of each mobile user
    • Location of each mobile user
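
    Putting the four loops together, the state dimension follows directly from these counts. The helper below is a hypothetical convenience for illustration, not repository code; bandwidth_entries stands for one_table.size.

      def state_dimension(edge_num, user_num, bandwidth_entries):
          # [edge capabilities | bandwidth table | offload targets | (x, y) per user]
          return edge_num + bandwidth_entries + user_num + 2 * user_num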
  • Action

    def generate_action(R, B, O):
        # USER_NUM, EDGE_NUM, r_bound, and b_bound are module-level constants
        a = np.zeros(USER_NUM + USER_NUM + EDGE_NUM * USER_NUM)
        # computing resources (normalized)
        a[:USER_NUM] = R / r_bound
        # migration bandwidth (normalized)
        a[USER_NUM:USER_NUM + USER_NUM] = B / b_bound
        # offload target: one one-hot block of width EDGE_NUM per user
        base = USER_NUM + USER_NUM
        for user_id in range(USER_NUM):
            a[base + int(O[user_id])] = 1
            base += EDGE_NUM
        return a
    • Computing resources each mobile user's task needs to use (continuous)
    • Migration bandwidth each mobile user's task needs to occupy (continuous)
    • Offloading target of each mobile user (discrete)
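
    The discrete offload choice can be read back out of this vector with an argmax over each user's one-hot block. The decoder below is a hypothetical counterpart to generate_action, written for illustration rather than taken from the repository.

      import numpy as np

      def decode_action(a, user_num, edge_num, r_bound, b_bound):
          R = a[:user_num] * r_bound              # computing resources per user
          B = a[user_num:2 * user_num] * b_bound  # migration bandwidth per user
          O = np.zeros(user_num, dtype=int)
          base = 2 * user_num
          for user_id in range(user_num):
              # each user owns an edge_num-wide one-hot block; argmax picks the target
              O[user_id] = np.argmax(a[base:base + edge_num])
              base += edge_num
          return R, B, O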
  • Reward

    • Total processed tasks in each step
  • Model Architecture

    ddpg architecture
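
    For readers unfamiliar with DDPG, the core of the algorithm is a one-step actor-critic update with target networks and soft (Polyak) updates. The sketch below is a generic, minimal TensorFlow 2 version under assumed names (actor and critic are Keras models, the critic taking [state, action] as input); it is not the repository's DDPG.py, and the GAMMA / TAU values are illustrative.

      import tensorflow as tf

      GAMMA, TAU = 0.9, 0.01  # discount factor and soft-update rate (illustrative)

      def ddpg_update(actor, critic, target_actor, target_critic,
                      actor_opt, critic_opt, s, a, r, s2):
          # critic: minimize the TD error against the target networks
          with tf.GradientTape() as tape:
              q_target = tf.stop_gradient(
                  r + GAMMA * target_critic([s2, target_actor(s2)]))
              critic_loss = tf.reduce_mean(tf.square(q_target - critic([s, a])))
          grads = tape.gradient(critic_loss, critic.trainable_variables)
          critic_opt.apply_gradients(zip(grads, critic.trainable_variables))

          # actor: maximize the critic's value of the actor's own actions
          with tf.GradientTape() as tape:
              actor_loss = -tf.reduce_mean(critic([s, actor(s)]))
          grads = tape.gradient(actor_loss, actor.trainable_variables)
          actor_opt.apply_gradients(zip(grads, actor.trainable_variables))

          # soft (Polyak) update of both target networks
          for net, target in ((actor, target_actor), (critic, target_critic)):
              for w, tw in zip(net.weights, target.weights):
                  tw.assign(TAU * w + (1.0 - TAU) * tw)

    Because this update runs at every environment step (as noted in the Description above), the policy can track the users' changing locations without waiting for an episode boundary.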


Simulation Result

  • Simulation Environment

    • 10 edge servers with computational resources 6.3 * 1e7 byte/sec
    • Each edge server can provide at most 4 task processing services.
    • 3000 steps/episode, 90000 sec/episode (i.e., 30 sec/step)
  • Result

    Number of Clients | Average total processed tasks in the last 10 episodes | Training History
    ------------------|-------------------------------------------------------|-----------------
    10                | 11910                                                 | result
    20                | 23449                                                 | result
    30                | 33257                                                 | result
    40                | 40584                                                 | result

Demo

  • Demo Environment

    • 35 mobile users and 10 edge servers in the environment
    • Each edge server can provide at most 4 task processing services.
  • Demo Video

    demo video

