Ben Chuanlong Du's Blog

It is never too late to learn.

Rust and Spark

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

The simplest and arguably best way to use Rust code from Spark is to leverage pandas_udf in PySpark. Inside the pandas UDF, you can call subprocess.run to run any shell command (for example, a compiled Rust binary) and capture its output.

from pathlib …
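As a rough illustration of the pandas_udf plus subprocess.run idea, here is a minimal sketch. The names `CMD`, `run_batch`, and `make_pandas_udf` are hypothetical and not from the original post. `CMD` is a stand-in for a compiled Rust binary: it is assumed to read one value per line from stdin and write one result per line to stdout (here it simply uppercases each line, so the sketch runs without Rust installed).

```python
import subprocess
import sys

# Hypothetical stand-in for a compiled Rust binary: reads lines from stdin
# and writes each line uppercased to stdout. Replace with your own command.
CMD = [
    sys.executable,
    "-c",
    "import sys\nfor line in sys.stdin: print(line.rstrip('\\n').upper())",
]


def run_batch(values, cmd=CMD):
    """Send one value per line to `cmd` and return one output line per value.

    Assumes the values contain no embedded newlines.
    """
    if not values:
        return []
    proc = subprocess.run(
        cmd,
        input="\n".join(values) + "\n",
        capture_output=True,  # collect stdout and stderr
        text=True,            # use str instead of bytes
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return proc.stdout.splitlines()


def make_pandas_udf():
    """Wrap run_batch in a pandas UDF (needs pyspark, pandas and pyarrow)."""
    import pandas as pd
    from pyspark.sql.functions import pandas_udf

    @pandas_udf("string")
    def run_batch_udf(s: pd.Series) -> pd.Series:
        # Simplification: nulls are converted to the string "None" here.
        return pd.Series(run_batch(s.astype(str).tolist()), index=s.index)

    return run_batch_udf
```

With a DataFrame `df` that has a string column `text` (both hypothetical), this could be applied as `df.withColumn("out", make_pandas_udf()(df["text"]))`. Starting one process per batch rather than per row keeps the process start-up overhead manageable, and the command must be available on every executor.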

Yarn for Spark

  1. List all Spark applications.

    yarn application -list
    
  2. Show status of a Spark application.

    yarn application -status application_1459542433815_0002
    
  3. View logs of a Spark application.

    yarn logs -applicationId application_1459542433815_0002
    
  4. Kill a Spark application …

Zellij Is the Best Terminal Multiplexer

zellij options --disable-mouse-mode

https://github.com/zellij-org/zellij

Persistent Sessions

A detached session becomes a persistent session.

Ctrl + o, then d

You can re-attach to a session using

zellij attach session_name