Ben Chuanlong Du's Blog

It is never too late to learn.

Git Implementations and Bindings in Python

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

There are multiple Git implementations/bindings in Python: pygit2, Dulwich, and GitPython.

Below is a simple comparison of the 3 packages.

                 pygit2                 dulwich        GitPython
Implementation   bindings to libgit2    pure Python    bindings …

Improve the Performance of Spark

Plan Your Work

  1. Having a clear idea of what you want to do is very important, especially when you are working on an exploratory project. It often saves you time to …

Tips on Python Build Standalone

The GitHub repository python-portable has some example scripts for bundling standalone Python environments. It also releases standalone Python environments regularly.

Tips on Using env_python.tar.gz

This section is specifically on …

Spark Issue: IllegalArgumentException: Wrong FS

Symptoms

java.lang.IllegalArgumentException: Wrong FS: hdfs://..., expected: viewfs://...

Possible Causes

The Spark cluster has migrated to Router-based Federation (RBF) namenodes, and viewfs:// (instead of hdfs://) is required to access HDFS …
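
If the cause is simply a path built with the old scheme, one possible workaround is to rewrite the URI before handing it to Spark. The following is a minimal sketch using only the Python standard library. The function name `to_viewfs`, the host `namenode:8020`, and the mount table name `cluster-name` are hypothetical, and whether replacing the authority is appropriate depends on how the cluster is actually configured.

```python
from urllib.parse import urlsplit, urlunsplit


def to_viewfs(uri: str, mount_table: str) -> str:
    """Rewrite an hdfs:// URI to viewfs://, leaving other URIs untouched.

    `mount_table` is a hypothetical ViewFs mount table name; use whatever
    name the cluster's configuration actually defines.
    """
    parts = urlsplit(uri)
    if parts.scheme != "hdfs":
        # Not an HDFS URI (e.g. a local path or s3://), so return it as-is.
        return uri
    # Swap the scheme and the authority; keep path, query and fragment.
    return urlunsplit(
        ("viewfs", mount_table, parts.path, parts.query, parts.fragment)
    )


# Example call with made-up values.
print(to_viewfs("hdfs://namenode:8020/user/alice/data", "cluster-name"))
```

Rewriting the string only helps when the path resolves under the ViewFs mount table; it does not replace fixing the configuration that produced the hdfs:// path in the first place.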

Spark Issue: ViewFs: Cannot Initialize: Empty Mount Table in Config

Symptoms

java.io.IOException: ViewFs: Cannot initialize: Empty Mount table in config for viewfs://cluster-name-ns02/

Possible Causes

As the error message says, viewfs://cluster-name-ns02 is not configured.

  1. It is possible that …
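
One way to confirm the condition named in the error message is to look for mount table entries in the Hadoop configuration. ViewFs reads its mount points from properties named fs.viewfs.mounttable.<name>.link.<path>, so an empty mount table means no property with that prefix exists for the name in the URI. The sketch below checks a plain dictionary of properties. The function name `has_mount_table` and the sample property values (including `cluster-name-ns01` and `hdfs://nn1/user`) are made up for illustration, and loading the real configuration is left out.

```python
def has_mount_table(conf: dict, name: str) -> bool:
    """Return True if `conf` defines any ViewFs mount table entry for `name`."""
    # The trailing dot prevents "ns0" from matching entries for "ns01".
    prefix = f"fs.viewfs.mounttable.{name}."
    return any(key.startswith(prefix) for key in conf)


# Hypothetical configuration: only ns01 has a mount point defined.
conf = {
    "fs.defaultFS": "viewfs://cluster-name-ns01/",
    "fs.viewfs.mounttable.cluster-name-ns01.link./user": "hdfs://nn1/user",
}

print(has_mount_table(conf, "cluster-name-ns01"))
print(has_mount_table(conf, "cluster-name-ns02"))
```

With this sample dictionary, the second check fails, which corresponds to the situation the exception describes for viewfs://cluster-name-ns02/.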