data_infrastructure:iniital_proposal

Differences

This shows you the differences between two versions of the page.

data_infrastructure:iniital_proposal [2016/06/21 02:47]
kluong
data_infrastructure:iniital_proposal [2021/09/19 21:59] (current)
Line 6: Line 6:
  
 The purpose of the data infrastructure is to create a platform that can be used to easily collect and store sensor data from any source. This platform will eventually fully support the efforts of the forecasting team, which, given good access to data, will be able to do their own independent research.
 +
 +https://www.draw.io/#G0Bxowpw1NF2d3YmRaRjBHWTJTckE
 ====== Outline ======
  
Line 13: Line 15:
 IV. Technical Modules \\
  
-====== Motivation ======
-
-Currently, the core project in the smart campus energy lab involves collecting data.
 +====== Previous Work ======
 
-Issues with the previous project:
 +**Issues with the previous project**
  
   * Documentation was poor
Line 25: Line 25:
   * Limited to one data type
  
 +====== Motivation and Summary ======
  
-====== Goals ======
 +It is currently very difficult to reliably gather time series data from embedded sensor devices.
 +This project aims to provide the software infrastructure to reliably collect data, add new
 +sensors, support new sensor types, and analyze the collected data.
  
-  * Highly available 
-  * Easy to interface with 
-  * Easy to contribute to 
-  * Communicate development well 
  
 +====== Specifications ======
  
 +High Level Categories:
 +
 +  * Availability
 +  * Interfaces
 +  * Libraries
 +  * Graphing
 +  * Contributions
 +  * Documentation
 +  * Extensibility
 +  * Logging ​
 +  * Verification
 +  * Validation
 +
 +
 +Misc:
 +
 +  * Outside users should be able to easily view nodes publicly
 +  * Each node deployment should be trackable
 +  * Lab users should be able to download datasets using any scripting language
 +  * We should be able to validate the data that is collected
 +  * We should be able to detect whether a sensor is down
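
One way the last requirement could be approached is to compare each sensor's most recent reading against a silence threshold. The sketch below is only an illustration: the function name ''find_down_sensors'', the ''last_seen'' mapping, and the 15-minute default are assumptions, not part of the proposal.

```python
from datetime import datetime, timedelta, timezone


def find_down_sensors(last_seen, now, max_silence=timedelta(minutes=15)):
    """Return the sorted IDs of sensors whose latest reading is older than max_silence.

    last_seen maps a sensor ID to the timestamp of its most recent reading
    (a hypothetical structure; the proposal does not define one).
    """
    return sorted(
        sensor_id
        for sensor_id, timestamp in last_seen.items()
        if now - timestamp > max_silence
    )


# Example data (made up for illustration).
now = datetime(2016, 6, 21, 12, 0, tzinfo=timezone.utc)
last_seen = {
    "node-a": now - timedelta(minutes=2),  # reported recently
    "node-b": now - timedelta(hours=3),    # silent for three hours
}
down = find_down_sensors(last_seen, now)
```

In a real deployment the ''last_seen'' values would presumably come from the database rather than a hard-coded dictionary, and the threshold would depend on each sensor's reporting interval.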
 +
 +
 +Okay this is really hard.
 ====== Technical Modules ======
 
 Here is a block diagram:
 
-{{ :data_infrastructure:sensor_infrastructure_proposal.png?direct&300 |}}
 +{{ :data_infrastructure:sensor_infrastructure_proposal_2.png?direct&300 |}}
  
 +===== High Level Blocks =====
  
  
   * **Client** - Primary interface into the data infrastructure - sensors with transport layers such as ZigBee will dump their data to these clients.
   * **Messaging Bus/Gateway** - Monitors all of the clients and makes sure that they are authorized to send data. Rejects invalid clients.
-  * **Compute / Data Backend** - Responsible for processing packets as they come in and putting them into the database.
 +  * **Data Backend** - Contains all of the logic necessary to store and process data. Contains a publicly accessible API that can be used to build client applications.
 +  * **Compute Backend** - Able to run large compute jobs such as graphing or analysis scripts. Serves dataset results and graphs through a filesystem or the API.
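
The gateway's role of accepting authorized clients and rejecting invalid ones might look roughly like the sketch below. The proposal does not specify an authorization mechanism, so the shared-token registry, the packet fields (''client_id'', ''token''), and the function names are all assumptions made for illustration.

```python
import hmac

# Hypothetical registry of client IDs and their shared tokens (not from the proposal).
AUTHORIZED_CLIENTS = {"zigbee-client-1": "s3cret"}


def is_authorized(packet, registry=AUTHORIZED_CLIENTS):
    """Return True only if the packet names a known client and carries its token."""
    expected = registry.get(packet.get("client_id"))
    if expected is None:
        return False
    # compare_digest avoids leaking token contents through timing differences.
    return hmac.compare_digest(expected, str(packet.get("token", "")))


def gateway_filter(packets):
    """Split incoming packets into (accepted, rejected) lists."""
    accepted, rejected = [], []
    for packet in packets:
        (accepted if is_authorized(packet) else rejected).append(packet)
    return accepted, rejected


# Example packets (made up for illustration).
good = {"client_id": "zigbee-client-1", "token": "s3cret", "value": 21.5}
bad = {"client_id": "unknown-client", "token": "s3cret", "value": 99.0}
accepted, rejected = gateway_filter([good, bad])
```

Only accepted packets would be handed on to the queue; what happens to rejected ones (logging, alerting, dropping) is left open here.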
 +===== Client =====
  
-Client: 
  
   * **Client Gateway**
Line 52: Line 78:
   * **Client Connector**
 
-Messaging Bus:
 +===== Messaging Bus =====
 
   * **Reverse Proxy/Balancer**
Line 59: Line 86:
   * **Gateway Queue**
 
-Compute/Data Backend:
 +===== Compute/Data Backend =====
 
   * **Worker** - Processes data and makes sure that they
Line 66: Line 94:
   * **Core database** - Database that stores all of our data. Currently PostgreSQL.
   * **Mirror database** - Public database that is read-only for public users. Mirrored from the core database.
-  * **Master Queue** - Main queue that exists between the gateway and the worker scripts. This makes it possible to upgrade the gateway without losing any data in the network.
 +  * **Gateway Queue** - Main queue that exists between the gateway and the worker scripts. This makes it possible to upgrade the gateway without losing any data in the network.
-  * **Client Queues** - Queues that exist on the client systems to buffer against possible disconnects from the main server
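
The decoupling that the queue provides between the gateway and the worker scripts can be sketched as follows. This is a simplified stand-in, not the proposed implementation: an in-process ''queue.Queue'' replaces whatever broker is eventually chosen, ''sqlite3'' replaces the PostgreSQL core database so the example is self-contained, and the ''readings'' table and packet fields are hypothetical.

```python
import queue
import sqlite3

# In-memory stand-in for the Gateway Queue (assumption: a real deployment
# would need a durable broker so data survives a gateway restart).
gateway_queue = queue.Queue()


def gateway_receive(packet):
    """Gateway side: enqueue the packet and return without touching the database."""
    gateway_queue.put(packet)


def worker_drain(db):
    """Worker side: move every queued packet into the database; return the count."""
    stored = 0
    while True:
        try:
            packet = gateway_queue.get_nowait()
        except queue.Empty:
            break
        db.execute(
            "INSERT INTO readings (sensor_id, ts, value) VALUES (?, ?, ?)",
            (packet["sensor_id"], packet["ts"], packet["value"]),
        )
        stored += 1
    db.commit()
    return stored


# sqlite3 is used here only as a stand-in for the core database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE readings (sensor_id TEXT, ts TEXT, value REAL)")

# Packets accumulate in the queue while no worker is running...
gateway_receive({"sensor_id": "node-a", "ts": "2016-06-21T02:47:00Z", "value": 21.5})
gateway_receive({"sensor_id": "node-b", "ts": "2016-06-21T02:47:05Z", "value": 19.0})

# ...and are written out once a worker drains it.
stored = worker_drain(db)
```

Because the gateway only ever writes to the queue and the worker only ever reads from it, either side can be stopped and replaced independently, provided the queue itself stays up.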
  
 ====== System Validation ======
Line 95: Line 122:
   * Sensor Platform
   * Data Sensor Platform
- 
  • data_infrastructure/iniital_proposal.1466477266.txt.gz
  • Last modified: 2021/09/19 21:59
  • (external edit)