  1. PySpark: multiple conditions in when clause - Stack Overflow

    Jun 8, 2016 · In PySpark, multiple conditions in when can be built using & (for and) and | (for or). Note: in PySpark it is important to enclose every expression within parentheses () that combine to form the …

  2. pyspark - How to use AND or OR condition in when in Spark - Stack …

    pyspark.sql.functions.when takes a Boolean Column as its condition. When using PySpark, it's often useful to think "Column Expression" when you read "Column". Logical operations on PySpark …

  3. Comparison operator in PySpark (not equal/ !=) - Stack Overflow

    Aug 24, 2016 · The selected correct answer does not address the question, and the other answers are all wrong for pyspark. There is no "!=" operator equivalent in pyspark for this solution.

  4. How to check if spark dataframe is empty? - Stack Overflow

    Sep 22, 2015 · On PySpark, you can also use bool(df.head(1)) to obtain a True or False value. It returns False if the dataframe contains no rows.

  5. spark dataframe drop duplicates and keep first - Stack Overflow

    Aug 1, 2016 · I just did something perhaps similar to what you guys need, using drop_duplicates in pyspark. The situation is this: I have 2 dataframes (coming from 2 files) which are exactly the same except 2 …

  6. How to change dataframe column names in PySpark?

    I come from a pandas background and am used to reading data from CSV files into a dataframe and then simply changing the column names to something useful using the simple command: df.columns =

  7. Filtering a Pyspark DataFrame with SQL-like IN clause

    Mar 8, 2016 · Filtering a Pyspark DataFrame with SQL-like IN clause Asked 9 years, 10 months ago Modified 3 years, 9 months ago Viewed 123k times

  8. Pyspark replace strings in Spark dataframe column

    Pyspark replace strings in Spark dataframe column Asked 9 years, 8 months ago Modified 1 year, 1 month ago Viewed 315k times

  9. How to change a dataframe column from String type to Double type in ...

    Aug 29, 2015 · I have a dataframe with a column as String. I wanted to change the column type to Double type in PySpark. Following is the way I did it: toDoublefunc = UserDefinedFunction(lambda x: …

  10. PySpark: How to fillna values in dataframe for specific columns?

    Jul 12, 2017 · PySpark: How to fillna values in dataframe for specific columns? Asked 8 years, 5 months ago Modified 6 years, 8 months ago Viewed 202k times