Skip to main content

Import CSV file from Google Drive to Big Query

I have written about using App Scripts to collect telemetry data and the ease of integrating it to Google Sheet to chart the temperature and humidity reading. In this post, I shall describe the approach using a CSV uploaded via Google File Stream and a scheduled job which is written in App Script to load the data into Big Query. This approach is very good for integrating the transactional capability of ERP system and the Analytics capability of Visualisation system (Data Studio).

ERP system has rich built-in business process workflow and transactional control, When performing Dashboard and Data Analytics functionality, it will slow down the entire system. In this case, it may be better for analytics features to be performed by another system. Fortunately, most ERP system is capable of auto execute a query to export CSV data. I have made use of this capability to import the daily files generated by ERP to Big Query via Google Drive. This CSV file is picked up by an App Script to import the file to Big Query.

App Script Setup

There are a couple of setups that are needed before the App Script is able to link up with Google Cloud Platform services.

1. Linkage to Google Cloud Platform. A new GCP project will be associated with the App Script project. I was trying to associate with an existing project but it does not recognise the existing project.

2. Enable Cloud APIs .There are 2 APIs to be enabled (Google Drive and BigQuery API). This has to be enabled at GCP and GAS. (This is a 2 steps process which I hope Google can simplify)


The APP Script Import Flow


I am using GSuite so Drive File Stream works for me. This utility program automatically syncs the CSV file from my computer to Google Drive. At Google Drive, I have written an App Script that periodically monitors the folder for the existence of a certain file e.g. sample_yyyymmdd.csv. When the file(s) are found, the scripts will automatically import the file into Big Query.
I have created 2 folders in Drive, "error" and "done". When the file is successfully imported in Big Query, the script will move the file to "done" folder or otherwise. 

Big Query Import Considerations

Big Query is very efficient in read operations but insert operations, update and delete of a record is very inefficient. In fact, Google has imposed quota on these operations. To get around this, when importing, I have created 1 table per file. This technique has another advantage whereby if the data for a particular day is incorrect, I only need to need to drop the table for that day and re-import the corrected data.

The App Script code can be found here.

Comments

Popular posts from this blog

DIY Sonoff RF Bridge

Tasmota recently supported the RF bridge by iTead which allowed for RF 433 remote devices to be controlled/controlling using Sonoff products. e.g A handphone can be used to control a remote RF device or an RF remote control can use to control a WIFI enabled device.

By using Tasmota, I can control the device using MQTT. This idea fits exactly to my smart home plan. Here is the schematic:



I cheated a bit by having ESP01 mounted on a breakout board. This breakout board converts all the external pins to 5v which simplified the wiring and I do not need to use a voltage level shifter. Moreover, the Arduino 3.3v power does not have enough current to power the ESP reliably and I have to power it via the 5V from Arduino Uno. Next, I used an Arduino Uno prototype shield with a tiny breadboard to hold all the components together. For the RF 433 transmitter/receiver, I am using the development kits when I purchased the all-in-one Arduino development kits. An antenna is added for more sensitivity.



Using ESP-Link transparent bridge (ESP-01 and Arduino Pro Mini)

Recently stumbled across an interesting open source project ESP-Link. Its main purpose is to network-enable a non-network microcontroller (MCU) such as Arduino Uno, Pro mini or Nano using ESP8266. The author termed it as "Transparent Bridge". The ESP and MCU  communicate via the serial link and there is a companion Arduino library EL-Client for the MCU to connect up the network using MQTT, REST, TCP and UDP.
Setup I have put together an ESP-01 and an Arduino Pro Mini for this experiment. I have chosen a 3.3 version Pro mini so that I do not need to do any voltage level shifting between the I/O pins. In order to have a stable voltage source, the ESP8266 is powered by Pro Mini and the Pro Mini "RAW" pin is connected to a 5v USB power source. The RAW pin can take voltage up to 12V. The reset pin of Pro Mini is connected to GPIO 0 of ESP-01. This enables the ESP-01 to reset the Pro Mini.




I have linked up an APDS 9960 sensor to it and periodically send the luminosity a…