Ben Chuanlong Du's Blog

It is never too late to learn.

Tensor Transformations in TorchVision

Comments

  1. Transformations in torchvision.transforms work on images, tensors (representing images) and possibly on numpy arrays (representing images). However, a transformation (e.g., ToTensor) might work differently on different input types, so you should be clear about what exactly a transformation does for the input type you pass to it. A good practice is to always convert your non-tensor input data to tensors using the transformation ToTensor

Cluster Management Made Easy with Ansible

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

Installation

sudo pip3 install ansible

Configuration

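Ansible looks for a configuration file in the following order and uses the first one it finds. (If the environment variable ANSIBLE_CONFIG is set, it takes precedence over all of them.)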

  1. ansible.cfg in the current directory.

  2. ~/.ansible.cfg

  3. /etc/ansible/ansible.cfg
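The "first match wins" lookup can be sketched in a few lines of Python. This is only an illustration of the search order, not Ansible's actual implementation: the function name find_ansible_config and its parameters are hypothetical, and the system-wide path is assumed to be /etc/ansible/ansible.cfg as in the Ansible documentation.

```python
import os
from pathlib import Path


def find_ansible_config(cwd, home, env=None):
    """Return the first existing config file in Ansible's search order, or None.

    Hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    candidates = []
    # The ANSIBLE_CONFIG environment variable, if set, is checked first.
    if env.get("ANSIBLE_CONFIG"):
        candidates.append(Path(env["ANSIBLE_CONFIG"]))
    candidates += [
        Path(cwd) / "ansible.cfg",         # 1. current directory
        Path(home) / ".ansible.cfg",       # 2. home directory
        Path("/etc/ansible/ansible.cfg"),  # 3. system-wide (assumed path)
    ]
    for path in candidates:
        if path.is_file():
            return path
    return None


if __name__ == "__main__":
    print(find_ansible_config(Path.cwd(), Path.home()))
```

To see which file Ansible itself picked up, running ansible --version prints the path of the config file in use.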

Examples

Copy a file to …

Cluster Management Made Easy with the Python Package Fabric


Ansible is a better alternative to Fabric. It is suggested that you use Ansible instead.

  1. The docstring of a task is displayed when you run the command fab -l.

  2. Invoke is for local use …

Hands on pandas.Series in Python

pandas.Series.str

  1. The accessor pandas.Series.str can only be used on a Series of str values. If you invoke it on a Series of non-string values, you will either encounter an AttributeError (Can only use .str accessor with string values, which use np.object_ dtype in pandas) or get back a Series of NaN's. If you have control of the DataFrame, the preferred way is to cast the column to str
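The two failure modes and the cast-to-str workaround can be reproduced with a small sketch. The Series contents below are made up for illustration.

```python
import pandas as pd

# A Series of integers: the .str accessor is rejected outright.
numbers = pd.Series([10, 20, 30])
raised = False
try:
    numbers.str.len()
except AttributeError as err:
    raised = True
    print("AttributeError:", err)

# A mixed object Series: non-string entries silently become NaN.
mixed = pd.Series(["abc", 1])
upper = mixed.str.upper()
print(upper)

# Casting to str first makes the accessor usable on every element.
lengths = numbers.astype(str).str.len()
print(lengths.tolist())
```

Note that astype(str) also turns missing values into the literal string "nan", so it may be worth handling NaN's before casting.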

Split a Dataset into Train and Test Datasets in Python

Scikit-learn Compatible Packages

sklearn.model_selection.train_test_split is the best way to split a dataset into train and test subsets for scikit-learn compatible packages (scikit-learn, XGBoost, LightGBM, etc.). It supports splitting both sequence-like objects (numpy arrays, lists, pandas Series) and pandas DataFrames. When splitting a sequence-like object, it returns (train, test), where train and test have the same container type as the input (e.g., lists for a list input). When splitting a pandas DataFrame, it returns (train, test)
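As a minimal sketch of the call, the snippet below splits a plain list and a small DataFrame. The data, the column names, the 80/20 ratio and the random_state value are arbitrary choices for illustration.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Splitting a plain Python list: both parts come back as lists.
items = list(range(10))
train_items, test_items = train_test_split(items, test_size=0.2, random_state=42)
print(type(train_items).__name__, len(train_items), len(test_items))

# Splitting a DataFrame: both parts come back as DataFrames,
# keeping the original index labels of the selected rows.
df = pd.DataFrame({"x": range(10), "y": [0, 1] * 5})
train_df, test_df = train_test_split(df, test_size=0.2, random_state=42)
print(type(train_df).__name__, train_df.shape, test_df.shape)
```

Fixing random_state makes the split reproducible across runs. By default the rows are shuffled before splitting, so pass shuffle=False when the original order has to be preserved.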