Scikit-learn Compatible Packages
sklearn.model_selection.train_test_split
is the standard way to split a dataset into train and test subsets
for scikit-learn compatible packages (scikit-learn, XGBoost, LightGBM, etc.).
It supports splitting both iterable objects (NumPy array, list, pandas Series) and pandas DataFrames.
When splitting an iterable object,
it returns (train, test) where train and test have the same type as the input (a list yields lists, a NumPy array yields arrays, a pandas Series yields Series).
When splitting a pandas DataFrame,
it returns (train, test)
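A minimal sketch of the call described above, assuming scikit-learn is installed. The sample data and the test_size and random_state values are illustrative choices, not taken from the original notes.

```python
from sklearn.model_selection import train_test_split

# Illustrative data: a plain Python list of 10 items (an assumption for this sketch).
data = list(range(10))

# Hold out 20% of the items for testing; random_state makes the shuffle repeatable.
train, test = train_test_split(data, test_size=0.2, random_state=0)

print(len(train), len(test))  # 8 2
print(type(train).__name__)   # list, because the input was a list
```

The same call accepts a pandas DataFrame in place of the list, and several arrays of equal length (for example features and labels) can be passed at once to split them consistently.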
Tips on Darknet and Yolo
Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!
https://pjreddie.com/darknet/tiny-darknet/
https://pjreddie.com/darknet/
Build Spark from Source
You can download a prebuilt Spark binary at https://spark.apache.org/downloads.html. This is where you should get started, and it will likely satisfy your needs most of the time …
Subtle Differences Between Spark DataFrame and PySpark DataFrame
- Besides using the col function to reference a column, a Spark/Scala DataFrame supports using $"col_name" (based on implicit conversion; requires import spark.implicits._), while a PySpark DataFrame supports using …
Use the Watch Command to Monitor Running Applications
Report the number of PNG images in the directory 000 every 2 seconds.
watch "ls 000/*.png | wc -l"
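For cases where the watch command is not available, the same periodic check can be approximated in Python. This is only a rough sketch using the standard library; the function names count_pngs and watch_png_count and the iterations parameter are hypothetical helpers introduced for illustration.

```python
import glob
import os
import time


def count_pngs(directory):
    """Return the number of files matching *.png directly inside directory."""
    return len(glob.glob(os.path.join(directory, "*.png")))


def watch_png_count(directory, interval=2.0, iterations=None):
    """Print the PNG count every `interval` seconds.

    iterations=None repeats until interrupted (like watch);
    an integer stops after that many reports.
    """
    done = 0
    while iterations is None or done < iterations:
        print(f"{directory}: {count_pngs(directory)} PNG file(s)")
        done += 1
        if iterations is None or done < iterations:
            time.sleep(interval)


# Report once for the directory 000; pass iterations=None to keep polling.
watch_png_count("000", interval=2.0, iterations=1)
```

Unlike watch, this sketch does not clear the screen between reports; it simply prints one line per check.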
Python Build Tools
poetry
poetry is a widely adopted dependency management and packaging tool for Python, and the author's preferred choice.
pybuilder/pybuilder
pypa/pipenv
pydoit
ninja
PlatformIO
Meson
buildbot/buildbot
Buildbot is a Python system to …