# SodaSpark Tasks
This module contains a collection of tasks for running data quality tests with the soda-spark library.
prefect.tasks.sodaspark.sodaspark_tasks.SodaSparkScan(scan_def=None, df=None, **kwargs)
Task for running a SodaSpark scan given a scan definition and a Spark DataFrame. For information about SodaSpark, please refer to https://docs.soda.io/soda-spark/install-and-use.html. SodaSpark uses PySpark under the hood, so Java must be installed on the machine where you run this task.
scan_def (str, optional): scan definition. Either a path to a YAML file containing the scan definition or the scan definition itself given as a valid YAML string. Please refer to https://docs.soda.io/soda-sql/scan-yaml.html for more information.
df (pyspark.sql.DataFrame, optional): Spark DataFrame on which to run the tests defined in the scan definition.
**kwargs (dict, optional): additional keyword arguments to pass to the Task constructor
Task run method. Execute a scan against a Spark DataFrame.
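As a minimal sketch of how the task might be called, the example below passes the scan definition as a YAML string and runs it against a small DataFrame. The table name, column names, metrics, and tests in `SCAN_DEF`, as well as the helper names `make_demo_df` and `run_scan`, are illustrative assumptions and not part of this module. The sketch also assumes that `prefect`, `pyspark`, `soda-spark`, and Java are installed, and it calls `run` directly outside of a flow.

```python
# Illustrative scan definition given as a YAML string (contents are assumptions).
SCAN_DEF = """
table_name: demodata
metrics:
  - row_count
tests:
  - row_count > 0
columns:
  size:
    tests:
      - min > 0
"""


def make_demo_df():
    """Build a tiny Spark DataFrame to scan (hypothetical demo data)."""
    # Imported lazily so the module loads even where PySpark is not installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    return spark.createDataFrame([("a", 1), ("b", 2)], schema=["name", "size"])


def run_scan(df, scan_def=SCAN_DEF):
    """Run SodaSparkScan against df and return whatever the task returns."""
    # Import path taken from the signature documented above.
    from prefect.tasks.sodaspark.sodaspark_tasks import SodaSparkScan

    task = SodaSparkScan()
    # scan_def may also be a path to a YAML file instead of a YAML string.
    return task.run(scan_def=scan_def, df=df)


if __name__ == "__main__":
    result = run_scan(make_demo_df())
    print(result)
```

Inside a flow, the same arguments could instead be supplied when the task is called within the flow context; the direct `run` call is used here only to keep the sketch short.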
This documentation was auto-generated from commit e6bd04a
on September 7, 2022 at 21:06 UTC