Sort Python Imports Using isort
Configuration
There are two recommended ways to configure isort.
The first is to place a file named .isort.cfg at the root of your project. For example:
[settings]
line_length=120
force_to_top=file1.py,file2.py
skip=file3.py,file4.py
known_future_library=future,pies
known_standard_library=std,std2
known_third_party=randomthirdparty
known_first_party=mylib1,mylib2
indent=' '
multi_line_output=3
length_sort=1
forced_separate=django.contrib,django.utils
default_section=FIRSTPARTY
no_lines_before=LOCALFOLDER
The second way is to add your desired settings under a [tool.isort] section in the pyproject.toml file at the root of your project.
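A minimal sketch mirroring a few of the settings above; note that comma-separated values from the .isort.cfg format become TOML arrays here, and the exact values are illustrative:

```toml
[tool.isort]
line_length = 120
multi_line_output = 3
known_first_party = ["mylib1", "mylib2"]
skip = ["file3.py", "file4.py"]
```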
Python Logging Made Stupidly Simple With Loguru
The best logging package for Python!
Note that the default logging level in loguru is DEBUG, and loguru does not let you change the logging level of a handler that has already been added. You can refer to changing-the-level-of-an-existing-handler and Change level of default handler for ways to change the logging level in loguru. The standard workaround is to remove the default handler (whose logging level is DEBUG) and add a new one with the desired logging level.
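A minimal sketch of this workaround; the INFO level and the stderr sink are illustrative choices:

```python
import sys

from loguru import logger

# Remove the default handler, which logs at the DEBUG level.
logger.remove()
# Add a new handler with the desired logging level.
logger.add(sys.stderr, level="INFO")

logger.debug("This message is filtered out.")
logger.info("This message is logged.")
```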
Tips on Fbs
Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!
https://build-system.fman.io/
https://github.com/mherrmann/fbs-tutorial
New Features in Spark 3
AQE (Adaptive Query Execution)
To enable AQE, you have to set spark.sql.adaptive.enabled to true, either by passing --conf spark.sql.adaptive.enabled=true to spark-submit or by calling spark.conf.set("spark.sql.adaptive.enabled", "true") in Spark/PySpark code.
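A minimal PySpark sketch of both approaches; the application name is an arbitrary placeholder:

```python
from pyspark.sql import SparkSession

# Enable AQE when the session is built ...
spark = (
    SparkSession.builder.appName("aqe-demo")  # placeholder name
    .config("spark.sql.adaptive.enabled", "true")
    .getOrCreate()
)

# ... or toggle it on an existing session.
spark.conf.set("spark.sql.adaptive.enabled", "true")
print(spark.conf.get("spark.sql.adaptive.enabled"))  # -> true
```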
Pandas UDFs
Pandas UDFs are user-defined functions that Spark executes using Apache Arrow to transfer data to pandas, which allows vectorized operations. A Pandas UDF is defined using the pandas_udf decorator.
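A minimal sketch using the Spark 3 type-hint style; the function and column names are illustrative:

```python
import pandas as pd
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, pandas_udf

spark = SparkSession.builder.getOrCreate()

# A Series-to-Series Pandas UDF: the whole column arrives as a
# pandas Series, so the addition below is vectorized.
@pandas_udf("double")
def plus_one(v: pd.Series) -> pd.Series:
    return v + 1.0

df = spark.range(3).withColumn("id_plus_one", plus_one(col("id")))
df.show()
```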