Ben Chuanlong Du's Blog

It is never too late to learn.

Koalas is pandas API on PySpark

Run Commands on Remote Machines

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

On a Single Machine

SSH

  1. The pipeline command is run locally. If you want the pipeline command to run remotely, place the whole command to be run remotely in double/single …
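As a rough illustration of the quoting point above, the sketch below builds the argument list for an ssh call in which the whole pipeline is passed as one argument, so that the remote shell, not the local one, interprets the pipe. The host `user@host`, the pipeline `ps aux | grep python`, and the helper name `remote_pipeline_argv` are hypothetical and chosen only for illustration.

```python
import shlex


def remote_pipeline_argv(host: str, pipeline: str) -> list[str]:
    """Build an argv list that sends the whole pipeline to the remote host.

    The pipeline is a single argument, so no local shell ever sees the "|".
    """
    return ["ssh", host, pipeline]


# Hypothetical host and pipeline, for illustration only.
argv = remote_pipeline_argv("user@host", "ps aux | grep python")

# shlex.join shows the equivalent command as typed in a local shell:
# the pipeline ends up inside quotes.
print(shlex.join(argv))  # ssh user@host 'ps aux | grep python'

# To actually run it (requires a reachable host and working SSH access):
# import subprocess
# subprocess.run(argv, check=True)
```

Without the quotes, the local shell would split the line at `|` and run only the first command on the remote machine.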

Hands on the json Module in Python

Tips and Traps

  1. It is suggested that you avoid using JSON for serializing and deserializing data. Please refer to Shortcomings of JSON for detailed discussions on this. TOML and YAML are better text-based alternatives to JSON. If serialization and deserialization are done in Python only, pickle
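One commonly cited limitation of JSON (not necessarily the one the linked discussion focuses on) is that a round trip does not preserve all Python types. A minimal sketch, using a made-up `record`, compares a JSON round trip with a pickle round trip:

```python
import json
import pickle

# A made-up record with an integer key and a tuple value.
record = {1: ("a", "b")}

# JSON only has string keys and arrays, so the types change on the way back.
via_json = json.loads(json.dumps(record))
print(via_json)  # {'1': ['a', 'b']}

# pickle is Python-specific and restores the original types.
via_pickle = pickle.loads(pickle.dumps(record))
print(via_pickle)  # {1: ('a', 'b')}
```

Note that pickle data should only be loaded from trusted sources, since unpickling can execute arbitrary code.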

Hands on the requests Module in Python

Comments

  1. It is suggested that you use the requests module instead of urllib unless you want to have minimal 3rd-party dependencies.

  2. Response.raise_for_status is a convenient method that raises an HTTPError when the response has a 4xx or 5xx status code and does nothing otherwise.
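The behavior of `raise_for_status` can be sketched without any network access by building a `requests.Response` by hand and setting its status code. This is only an illustration: in real code the response would come from a call such as `requests.get`, and the helper name `status_is_ok` is hypothetical.

```python
import requests


def status_is_ok(status_code: int) -> bool:
    """Return True if raise_for_status() accepts the given status code."""
    # Build a bare Response by hand; normally requests.get() returns one.
    response = requests.Response()
    response.status_code = status_code
    try:
        response.raise_for_status()
    except requests.exceptions.HTTPError:
        # Raised for 4xx (client error) and 5xx (server error) codes.
        return False
    return True


print(status_is_ok(200))  # True
print(status_is_ok(404))  # False
print(status_is_ok(500))  # False
```

A typical pattern is to call `raise_for_status()` right after the request, so that failed requests surface as exceptions instead of being processed as if they had succeeded.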