Ben Chuanlong Du's Blog

It is never too late to learn.

Rust and Spark

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

The simplest way to use Rust code from Spark is to go through pandas_udf in PySpark. Inside the pandas UDF, you can call subprocess.run to run any shell command (for example, a compiled Rust executable) and capture its output.

from pathlib …
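The code excerpt above is truncated, so what follows is not the author's code but a separate minimal sketch of the same idea, under stated assumptions. The names `run_per_value` and `make_udf` are hypothetical, and the command is assumed to read one value from stdin and write its result to stdout. The Spark-specific imports are kept inside `make_udf` so the helper can be tried without a Spark installation.

```python
import subprocess
import sys


def run_per_value(values, cmd):
    """Run `cmd` once per value, passing the value on stdin.

    Returns the stripped stdout of each run as a list of strings.
    (Hypothetical helper, used here only for illustration.)
    """
    outputs = []
    for value in values:
        proc = subprocess.run(
            cmd,
            input=str(value),
            capture_output=True,  # collect stdout and stderr
            text=True,            # decode bytes to str
            check=True,           # raise CalledProcessError on non-zero exit
        )
        outputs.append(proc.stdout.strip())
    return outputs


def make_udf(cmd):
    """Wrap `run_per_value` in a pandas UDF (assumes pandas and PySpark 3.x)."""
    import pandas as pd
    from pyspark.sql.functions import pandas_udf

    @pandas_udf("string")
    def call_command(col: pd.Series) -> pd.Series:
        # Each batch of rows arrives as a pandas Series.
        return pd.Series(run_per_value(col, cmd), index=col.index)

    return call_command


# Stand-in for a compiled executable: upper-case whatever arrives on stdin.
demo_cmd = [sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"]
demo_out = run_per_value(["a", "b"], demo_cmd)
print(demo_out)
```

With Spark available, `make_udf(demo_cmd)` could be applied to a string column via `df.withColumn(...)`. Starting one process per row is expensive, so a real job would probably pass a whole batch to the executable in a single call.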

Calling Shell from Python

  1. subprocess.run is preferred over os.system for invoking shell commands. For more discussion, please refer to [Hands on the Python module subprocess]https://www.legendu.net/en/blog …
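One reason for the preference can be sketched briefly: subprocess.run returns an object holding the exit code and, if requested, the captured output, whereas os.system only returns an exit status. The command below is an arbitrary stand-in that uses the current Python interpreter, so it does not depend on any particular shell.

```python
import subprocess
import sys

# Portable stand-in for a shell command: have Python print a word.
cmd = [sys.executable, "-c", "print('hello')"]

# capture_output=True collects stdout and stderr, text=True decodes them to str,
# and check=True raises CalledProcessError if the command exits non-zero.
result = subprocess.run(cmd, capture_output=True, text=True, check=True)

print(result.returncode)
print(result.stdout.strip())
```

Passing the command as a list avoids going through a shell, which sidesteps quoting problems when arguments contain spaces.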