Ben Chuanlong Du's Blog

It is never too late to learn.

Check Whether a File Exists in Spark

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

org.apache.hadoop.fs.FileSystem

val conf = sc.hadoopConfiguration
val fs = org.apache.hadoop.fs.FileSystem.get(conf)
val exists = fs.exists(new org.apache.hadoop.fs.Path("/path/on/hdfs …

Change Modified Time of Files

  1. Change the modified timestamp of a file to a specified timestamp.

    touch -m -t 201512180130.09 some_file
    
  2. Change the modified timestamp of a file to the current time.

    touch -m some_file …
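To see the effect of `touch -m -t`, here is a minimal sketch that sets a modification time and reads it back. It assumes GNU coreutils (where `date -r FILE` prints a file's modification time; BSD/macOS `date -r` behaves differently), and the temporary directory and file are hypothetical names used only for illustration.

```shell
# Work in a throwaway directory so no real file is modified.
tmpdir=$(mktemp -d)
demo_file="$tmpdir/some_file"   # hypothetical file, for illustration only
touch "$demo_file"

# -m changes only the modification time; -t takes [[CC]YY]MMDDhhmm[.SS]
# interpreted in the local time zone.
touch -m -t 201512180130.09 "$demo_file"

# GNU date: -r FILE reports the file's last modification time.
mtime=$(date -r "$demo_file" +%Y%m%d%H%M.%S)
echo "$mtime"

# Clean up the temporary directory.
rm -rf "$tmpdir"
```

The timestamp is read back in the same format that was passed to `-t`, which makes the round trip easy to compare by eye.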

Parallel Computing in Bash

  1. & and wait

  2. parallel

  3. xargs

parallel (GNU Parallel) is a cool command-line tool which makes parallel computing easy in Bash. It can be used as a parallel replacement for xargs.
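The first and third approaches in the list above can be sketched as follows. This is only an illustration: the "jobs" are hypothetical stand-ins that write a small file each, and `xargs -P` is assumed to be available (it is supported by GNU and BSD xargs but is not required by POSIX).

```shell
# Scratch directory for the output of the hypothetical jobs.
tmpdir=$(mktemp -d)

# 1. & and wait: start each job in the background,
#    then block until every background job has finished.
for i in 1 2 3; do
    ( echo "job $i" > "$tmpdir/bg_$i.txt" ) &
done
wait

# 3. xargs -P: run up to 3 commands at a time, one input line per command.
#    The input line replaces {} and reaches the inner shell as $1;
#    the scratch directory is passed as $2.
printf '%s\n' 1 2 3 |
    xargs -P 3 -I {} sh -c 'echo "job $1" > "$2/xargs_$1.txt"' _ {} "$tmpdir"

# Each approach should have produced three files.
count=$(ls "$tmpdir" | wc -l | tr -d ' ')
echo "$count"

# Clean up the scratch directory.
rm -rf "$tmpdir"
```

If GNU Parallel is installed, a comparable invocation would be something like `parallel 'echo job {}' ::: 1 2 3`, which runs one command per argument across the available cores.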