Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!
Comments¶
There is no (direct) way of select all columns except a few from a table using SQL. However, this is easily doable with DataFrame APIs (pandas, Spark/PySpark, etc.).
In [1]:
import pandas as pd
import findspark
findspark.init("/opt/spark-3.0.1-bin-hadoop3.2/")
from pyspark.sql import SparkSession, DataFrame
from pyspark.sql.functions import *
from pyspark.sql.types import StructType
spark = SparkSession.builder.appName("Case/When").enableHiveSupport().getOrCreate()
In [2]:
df = spark.createDataFrame(
pd.DataFrame(
data=(
("Ben", "Du", 1),
("Ben", "Du", 2),
("Ben", "Tu", 3),
("Ben", "Tu", 4),
("Ken", "Xu", 1),
("Ken", "Xu", 9),
),
columns=("fname", "lname", "score"),
)
)
df.show()
Select all columns except score
.
In [4]:
df.select(*[col for col in df.columns if col not in ("score")]).show()