====== Feature Idea - Staging Environment ======

===== Background =====

A typical pattern when running systems is to have multiple environments. There's the production environment, which handles the "real" data or traffic, and usually a staging environment that consists of "test" or fake data and has no impact on what happens in the real system. The staging environment is useful because it lets you test changes to your software or configuration in an environment that closely mirrors production, giving you confidence that your changes won't break anything in the real world.

===== Motivation =====

It would be great if we had one for our sensor network - that way we could test changes to any part of the network and see whether they would work in production. This would be useful for testing gateway or database changes, for example.

  * useful for testing
  * good educational experience - you get to walk through the whole process
  * one outcome of setting up the staging environment is a document that tells the story of what we had to do to set it up (useful for others to read later on)

In the past, folks have set up something like this - the "test" gateway running on a Raspberry Pi. That's a great test platform and lets the firmware and software team try things out, but it's not something that's running all the time. A true staging environment should be running all the time, and should be free for anyone to use for testing whenever they need it.

===== Implementation =====

Here is a starting list of tasks - this can be added to later on:

  * Go through the tasks and put them onto a Google Doc for collaboration / documentation
  * Get a new XBee and USB-serial adapter for use as a coordinator
  * Configure the new XBee with the proper parameters and a new PAN ID for the staging environment
  * Plug the new XBee into scelserver-1
  * Create a new postgres database for staging
  * Create new database credentials for staging
  * Create the tables required by the gateway in the staging database
  * Create a new gateway instance on scelserver-1
  * Create a new systemd service for the staging gateway
  * Set up a stubbed weatherbox node that transmits fake samples to the new staging XBee
  * Create a grafana dashboard to check on the status of the staging network

===== Technical Details =====

==== Systemd Intro ====

We currently use systemd to run the production gateway. We use systemd primarily to manage the gateway process and to make sure that it starts back up when the system reboots or if the process crashes somehow.

More details:
  * https://en.wikipedia.org/wiki/Systemd

**Simple Example**

Here's a simple example to provide a bit more context. Let's say I had a server sitting somewhere that I wanted to run a website on. I can use python's SimpleHTTPServer to do that. It's a simple, one-command webserver that serves up the files in a directory so they're accessible in a browser. So let's create a directory and run our webserver:

<code>
kluong@kserver:~$ mkdir test-webserver
kluong@kserver:~$ cd test-webserver/
kluong@kserver:~/test-webserver$ ls
kluong@kserver:~/test-webserver$ echo "hello world" > index.html
kluong@kserver:~/test-webserver$ python -m SimpleHTTPServer 7000
Serving HTTP on 0.0.0.0 port 7000 ...
</code>

In a browser, if I go to localhost:7000, I'll see "hello world".
You can also use the `curl` command in a separate terminal window:

<code>
kluong@kserver:~/test-webserver$ curl localhost:7000
hello world
</code>

Now this is great, but if you close the terminal window or reboot the machine, the website becomes unavailable because the program is no longer running. You need something to manage this program and make sure it gets run every time the machine starts, without someone having to open up a terminal window. This is where systemd comes in. A systemd config file could look something like this:

<code ini>
[Unit]
Description=My Website

[Service]
Type=simple
User=kluong
WorkingDirectory=/home/kluong/test-webserver
ExecStart=/usr/bin/python -m SimpleHTTPServer 7000
Restart=on-failure

[Install]
WantedBy=multi-user.target
</code>

You would save this as something like /etc/systemd/system/my-website.service, run `sudo systemctl daemon-reload` so systemd picks it up, and then `sudo systemctl enable --now my-website` to start the webserver immediately and on every boot. Because of the `Restart=on-failure` line, systemd will also restart it if it crashes.

**Details from the production gateway**

You can inspect the current systemd unit by logging into scelserver-1 and running `systemctl status`, which shows where the unit file is defined:

<code>
kluong@scelserver-1:~$ systemctl status xbee-gateway
● xbee-gateway.service - Scel XBee Gateway
   Loaded: loaded (/etc/systemd/system/xbee-gateway.service; enabled; vendor preset: enabled)
   Active: active (running) since Tue 2021-04-20 14:26:43 HST; 5 months 17 days ago
 Main PID: 661 (run_production.)
      CPU: 16h 58min 11.722s
   CGroup: /system.slice/xbee-gateway.service
           ├─661 /bin/bash /home/scel/control-tower/gateway/run_production.sh auto
           └─725 python gateway_server.py auto
</code>

Looks like it's defined at /etc/systemd/system/xbee-gateway.service. Let's take a look at it:

<code ini>
[Unit]
Description=Scel XBee Gateway
After=network.target
After=xbee-pty-bridge.service

[Service]
Type=simple
# Another Type option: forking
User=scel
WorkingDirectory=/home/scel/control-tower/gateway
ExecStart=/home/scel/control-tower/gateway/run_production.sh auto
Restart=on-failure
# Other Restart options: always, on-abort, etc.

[Install]
WantedBy=multi-user.target
</code>

It's possible to re-use most of this existing systemd configuration to create the staging configuration - the directories and names just have to be changed appropriately.
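For example, a staging unit could look roughly like the sketch below. To be clear, the unit name `xbee-gateway-staging.service`, the `control-tower-staging` directory, and the `run_staging.sh` script are placeholders I'm assuming here - they'd need to match however the staging checkout and scripts actually end up being laid out on scelserver-1.

<code ini>
# Sketch only: the file name, directories, and script name below are placeholders,
# not things that exist yet. Intended location: /etc/systemd/system/xbee-gateway-staging.service
[Unit]
Description=Scel XBee Gateway (staging)
After=network.target

[Service]
Type=simple
User=scel
WorkingDirectory=/home/scel/control-tower-staging/gateway
ExecStart=/home/scel/control-tower-staging/gateway/run_staging.sh auto
Restart=on-failure

[Install]
WantedBy=multi-user.target
</code>

Installing and enabling it would work the same way as the simple example above (`daemon-reload`, then `enable --now`). Whether it also needs an `After=xbee-pty-bridge.service` line like the production unit depends on how the staging XBee ends up being attached.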
==== Database Changes ====

We'll share the same postgres instance as our production system - that way we don't need a completely new instance of postgres. What we do want, however, is a database separate from the one the production data uses, so we can isolate things a bit.

Let's take a look at what the production instance uses. It turns out our gateway is hardcoded to use certain values:

https://github.com/scel-hawaii/control-tower/blob/master/gateway/src/decoder.py#L127-L128

<code python>
con = psycopg2.connect("dbname='control_tower' user='control_tower' password='' host='localhost'")
</code>

So it's configured to connect to localhost and use the 'control_tower' database as the 'control_tower' user. We'll have to change this so it's not hardcoded if we want the staging instance to use a different database, but we can revisit that in another change later on.

So we'll need a new database and a new user - how can we do that?

=== Creating the database and user ===

Here are some docs from the postgres website:

  * https://www.postgresql.org/docs/9.0/sql-createuser.html
  * https://www.postgresql.org/docs/9.0/sql-createdatabase.html

We can create a new user within postgres with:

<code sql>
CREATE USER control_tower_staging;
</code>

And we can create a new database with:

<code sql>
CREATE DATABASE control_tower_staging OWNER control_tower_staging;
</code>

To be able to do this, you'll need to be logged in as a user with the `superuser` permission, which the `postgres` user typically has. On the server, switch to the postgres user and run the `psql` command to get a prompt where you can run these statements:

<code>
sudo su postgres
psql
</code>

**Note** - be sure to take care when using sudo! It gives you access to do a lot of things to the existing system, including removing system files that you normally wouldn't be able to touch.

To check the users, use the ''%%\du%%'' command in postgres:

<code>
postgres=# \du
                                      List of roles
       Role name       |                   Attributes                   | Member of
-----------------------+------------------------------------------------+-----------
 control_tower         |                                                | {}
 control_tower_ro      |                                                | {}
 control_tower_staging |                                                | {}
 kluong                |                                                | {}
 postgres              | Superuser, Create role, Create DB, Replication | {}
</code>

To check the databases, use the ''%%\l%%'' command in postgres:

<code>
postgres=# \l
                                      List of databases
     Name      |     Owner     | Encoding |   Collate   |    Ctype    |      Access privileges
---------------+---------------+----------+-------------+-------------+------------------------------
 bears         | control_tower | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 control_tower | postgres      | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =Tc/postgres                +
               |               |          |             |             | postgres=CTc/postgres       +
               |               |          |             |             | control_tower=CTc/postgres  +
               |               |          |             |             | control_tower_ro=c/postgres
 kluong        | postgres      | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =Tc/postgres                +
               |               |          |             |             | postgres=CTc/postgres       +
               |               |          |             |             | kluong=CTc/postgres
 postgres      | postgres      | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 template0     | postgres      | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres                 +
               |               |          |             |             | postgres=CTc/postgres
 template1     | postgres      | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres                 +
               |               |          |             |             | postgres=CTc/postgres
</code>

=== Populating the tables ===

Okay, now the database and user are set up, but we also need to create the tables inside the new database. There's a schema file in the control-tower repo under the db/ folder that will do this. Make sure your current directory is the control-tower repo, then run the following as the postgres user:

<code>
psql -U control_tower_staging -d control_tower_staging -f db/multi-table.sql
</code>

After this is done, that's all the database configuration you'll need; the remaining changes are on the gateway side and are described below. You can check that the tables were created properly by running:

<code>
psql -U control_tower_staging -d control_tower_staging -c "SELECT * FROM pg_catalog.pg_tables WHERE schemaname='public'"
</code>
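Before moving on to the gateway, it may be worth a quick smoke test that the staging role can connect on its own over localhost, since that's how the gateway will connect. One caveat - this is an assumption about how authentication is configured on scelserver-1, so check `pg_hba.conf` there: if local/TCP connections require password authentication, the new role will need a password first, e.g. `ALTER ROLE control_tower_staging WITH PASSWORD '...';` run from the postgres superuser session.

<code bash>
# Connect as the staging role over localhost (the same way the gateway will)
# and list the tables it can see. If this is rejected or prompts unexpectedly,
# check pg_hba.conf and/or set a password for the role as noted above.
psql -U control_tower_staging -d control_tower_staging -h localhost -c "\dt"
</code>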
==== Gateway Change - ability to pass in a database URI ====

I mentioned earlier that we'll have to change the way the gateway works. This is how the current code connects to the database:

<code python>
con = psycopg2.connect("dbname='control_tower' user='control_tower' password='' host='localhost'")
</code>

https://github.com/scel-hawaii/control-tower/blob/master/gateway/src/decoder.py#L127-L128

What we want is to be able to specify the database configuration separately - using a configuration file, a flag, or an environment variable. The environment variable is probably the easiest here, so let's go with that. Python programs can read in variables that are set by the environment:

  * https://able.bio/rhett/how-to-set-and-get-environment-variables-in-python--274rgt5

So we can modify the code to do something like this:

<code python>
db_uri = os.environ["GATEWAY_DB_URI"]
con = psycopg2.connect(db_uri)
</code>

Note that we're now passing a connection URI instead of the keyword-style string we used previously. A shell script that calls the gateway would look something like this for staging:

<code bash>
#!/bin/bash

# setup the python env
source ./env/bin/activate

export GATEWAY_DB_URI="postgresql://control_tower_staging@localhost/control_tower_staging"
python gateway.py /dev/serial/by-id/usb-FTDI_FT231X_USB_UART_DN01DS3L-if00-port0
</code>

You'd need to see some RX frames on the tty device to test this properly, though, since the gateway doesn't reach out to the database until it has a message to decode. To get this working in production, you would have to modify the run_production.sh script in the same way and update the repository on the production server.

==== Gateway - testing changes ====

Options for testing:

  * Deploy to production - see what happens
  * Deploy to staging - see what happens
  * Set up a personal network (on your laptop, or maybe on the server)
  * Connect the gateway to a "fake" XBee virtually, using a pty and another script
  * Mock out the serial device completely in python

It's not always practical to have hardware to test with, and often hardware isn't available anyway. Luckily, with software we can work around this.

**Note: development for the gateway should generally be done in a linux-based environment.** The gateway code was primarily designed for linux-based environments and may not run locally on a non-linux machine.

You can make a "fake" XBee network using the following python script:

<code python>
import os
import errno
import time


def symlink_force(target, link_name):
    """Create a symlink, replacing it if it already exists."""
    try:
        os.symlink(target, link_name)
    except OSError as e:
        if e.errno == errno.EEXIST:
            os.remove(link_name)
            os.symlink(target, link_name)
        else:
            raise


def valid_packets():
    """A handful of valid XBee API frames, keyed by node name."""
    packets = {}
    packets['heartbeat'] = b"\x7e\x00\x16\x90\x00\x7d\x33\xa2\x00\x40\xe6\x4b\x5e\x03\xfd\x01\x00\x00\xff\xff\xf0\xfa\x23\x00\x2b\x02\xb2"
    packets['apple'] = b"\x7e\x00\x22\x90\x00\x7d\x33\xa2\x00\x40\x9f\x27\xa7\x29\x6c\x01\x01\x00\xff\xff\x80\x6f\x69\x3d\x06\x0f\x71\x7d\x33\x5a\x8a\x01\x00\x76\x01\x22\x00\x6e\x09\x55"
    packets['cranberry'] = b"\x7e\x00\x22\x90\x00\x7d\x33\xa2\x00\x41\x25\xe5\x88\x0c\x83\x01\x02\x00\xff\xff\x7c\xf3\x05\x00\xba\x0f\x5c\x08\x05\x00\x20\x73\x3b\x00\xdd\x8b\x01\x00\x7a"
    packets['dragonfruit'] = b"\x7e\x00\x24\x90\x00\x7d\x33\xa2\x00\x40\xe6\x72\x7d\x5e\x30\x18\x01\x03\x00\xff\xff\x30\xc8\x07\x00\x6b\x0d\xf4\x00\x06\x00\x00\x00\xb6\x72\x37\x00\xfe\x8b\x01\x00\x00"
    packets['snapdragon'] = b"\x7e\x00\x22\x90\x00\x7d\x33\xa2\x00\x40\xa3\x53\x7d\x5e\x20\x9c\x01\x04\x00\xff\xff\x12\xe4\x49\x00\xca\x0d\x44\x0c\x2c\x31\x01\x00\x2f\x01\x34\x00\x64\x00\xbb"
    return packets


# Open a pseudo-terminal pair: frames written to the master end show up as
# incoming serial data to whatever opens the slave end (the gateway).
master_fd, slave_fd = os.openpty()
symlink_force(os.ttyname(slave_fd), '/tmp/fakexbee')

packets = valid_packets()
while True:
    for key in packets:
        os.write(master_fd, packets[key])
        time.sleep(1)
</code>
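To tie this together: with the script above running in one terminal, `/tmp/fakexbee` looks like a serial device that's constantly receiving frames, so the gateway can be pointed at it instead of real hardware. The commands below are only a sketch - they assume the environment-variable change described earlier has been made, that the script above was saved as `fake_xbee.py` (any name works), and that the staging database from the previous section exists.

<code bash>
# Terminal 1: run the fake XBee, which writes roughly one frame per second to the pty
python fake_xbee.py

# Terminal 2: run the gateway against the fake serial device and the staging database
source ./env/bin/activate
export GATEWAY_DB_URI="postgresql://control_tower_staging@localhost/control_tower_staging"
python gateway.py /tmp/fakexbee
</code>

If everything is wired up correctly, decoded samples for the fake packets should start landing in the staging tables, which you can confirm with psql or the staging grafana dashboard.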