Ben Chuanlong Du's Blog

It is never too late to learn.

Convert MS Office Document to Text

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

  1. xls2csv -f also parses lines that have an empty cell in the first column

  2. catdoc, xls2csv and catppt convert Word, Excel and PowerPoint files to text, respectively

  3. ssconvert file.csv file.xls (gnumeric)

Some Reading Notes About SSD

  1. Samsung 840 Pro is a good one

  2. not necessary for SATA II interface

  3. SSD is not as reliable as HD

  4. MLC has a much longer life than TLC

  5. don't defragment disk (on …

Cross Validation in Machine Learning

Training and Testing Data Set

  • good when you have a large amount of data

  • usually use 1/5 to 1/3 of the data as the testing data set
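The split described above can be sketched in a few lines of Java. This is only an illustration, not a method from the original notes: the class name `TrainTestSplit`, the `split` helper, and the fixed seed are all made up for this example, and the 1/5 test fraction is just the lower end of the range mentioned above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class TrainTestSplit {
    /** Holds the two disjoint parts of a split (hypothetical helper type). */
    public static final class Split<T> {
        public final List<T> train;
        public final List<T> test;

        Split(List<T> train, List<T> test) {
            this.train = train;
            this.test = test;
        }
    }

    /**
     * Shuffles a copy of the data and puts roughly testFraction of it
     * into the testing set; the rest becomes the training set.
     */
    public static <T> Split<T> split(List<T> data, double testFraction, long seed) {
        if (testFraction <= 0.0 || testFraction >= 1.0) {
            throw new IllegalArgumentException("testFraction must be in (0, 1)");
        }
        // Shuffle a copy so the caller's list is left untouched.
        List<T> shuffled = new ArrayList<>(data);
        Collections.shuffle(shuffled, new Random(seed));

        int testSize = (int) Math.round(shuffled.size() * testFraction);
        List<T> test = new ArrayList<>(shuffled.subList(0, testSize));
        List<T> train = new ArrayList<>(shuffled.subList(testSize, shuffled.size()));
        return new Split<>(train, test);
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            data.add(i);
        }
        // Hold out 1/5 of the observations for testing.
        Split<Integer> s = split(data, 0.2, 42L);
        System.out.println("train=" + s.train.size() + ", test=" + s.test.size());
    }
}
```

Shuffling before cutting matters when the data arrive in some order (by time or by class, say); a fixed seed keeps the split reproducible.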

K-fold CV

  • suitable when …

Estimate FDR

The problem is actually to estimate the number of null hypotheses.

  • Benjamini

  • Nettleton
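To see why the number of true null hypotheses matters, here is a short sketch based on the standard Benjamini-Hochberg (BH) procedure. The notation ($m$ tests, $m_0$ true nulls, estimate $\hat m_0$, level $\alpha$) is introduced here for illustration, and the notes do not say which estimator of $m_0$ the author has in mind.

```latex
% V = number of false rejections, R = total number of rejections
\mathrm{FDR} = E\!\left[\frac{V}{\max(R, 1)}\right]

% BH at level \alpha: reject the k smallest p-values, where
k = \max\left\{ i : p_{(i)} \le \frac{i\,\alpha}{m} \right\}

% Under independence, BH is conservative by the factor m_0 / m:
\mathrm{FDR} = \frac{m_0}{m}\,\alpha \le \alpha

% An adaptive variant plugs in an estimate \hat m_0 in place of m:
k_{\text{adaptive}} = \max\left\{ i : p_{(i)} \le \frac{i\,\alpha}{\hat m_0} \right\}
```

The better the estimate $\hat m_0$, the less of the target level $\alpha$ is wasted, which is why the problem reduces to estimating the number of true nulls.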

Regression Classification ANOVA

Regression refers to problems where the response (output) variable is continuous, while classification refers to problems where the response (output) variable is discrete.

Generally speaking, fitting regression to classification problems is …

Java Features

  1. String in Switch

Java 7 allows strings in switch statements, where earlier versions accepted only integer-like types and enums, which makes things much more convenient (see the following example).

public void foo(Foo t) {
    String …
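Since the excerpt above is cut off, here is a separate, self-contained sketch of a string switch. It is not the author's example: the class `StringSwitchDemo`, the method `describe`, and the command names are invented for illustration.

```java
public class StringSwitchDemo {
    /** Maps a (hypothetical) command name to a short description. */
    static String describe(String command) {
        // Matching uses String.equals semantics, so it is case-sensitive.
        // Passing null to a string switch throws NullPointerException.
        switch (command) {
            case "start":
                return "starting";
            case "stop":
                return "stopping";
            default:
                return "unknown command: " + command;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("start"));
        System.out.println(describe("Stop"));
    }
}
```

Before Java 7 the same logic needed an if/else chain of `equals` calls; the switch form keeps each case on its own line and makes the default branch explicit.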