
Python Script

The Python script operator is a built-in notebook for writing Python snippets that manipulate data with Pandas (see the Pandas documentation to learn more). The operator gives you access to a full Python run-time environment. To write a snippet, drag the Python script operator onto the canvas, add your code in the available space, and run it by pressing the play button in the bottom left. The resulting dataframe is output in the bottom right.

Using Pandas

Pandas is automatically available for use in the Python operator. Invoke Pandas functions using the identifier pd.
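As a minimal sketch of calling a Pandas function through pd, the snippet below parses date strings with pd.to_datetime. The import line and the sample values are assumptions added so the snippet runs on its own; inside the operator, pd is already defined.

```python
import pandas as pd  # only needed outside the operator, where pd is not predefined

# Hypothetical sample values, for illustration only
raw = pd.Series(["2024-01-05", "2024-02-10"])

# Call a Pandas function through the pd identifier
parsed = pd.to_datetime(raw)

# Extract the month number from each parsed date
months = parsed.dt.month.tolist()
```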

Working With One Dataframe

The input dataframe is accessible using the name df. You can then manipulate the dataframe using normal Pandas syntax, e.g. df['time']. The output dataframe will be df after any changes, so there is no need to return anything.

In the following example, we use the Python operator to add a column named large city to our dataframe to indicate whether a city has a population of more than 7000.

Single dataframe input
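A snippet along these lines could produce that column. The column names city and population and the sample rows are assumptions; inside the operator, df is supplied by the upstream input, so only the last line would be needed.

```python
import pandas as pd  # already available as pd inside the operator

# Hypothetical stand-in for the operator's input dataframe
df = pd.DataFrame({
    "city": ["Ashford", "Brigham", "Carlow"],
    "population": [5200, 7000, 12500],
})

# Flag cities whose population is strictly greater than 7000
df["large city"] = df["population"] > 7000
```

Because the comparison is strict, a city with exactly 7000 inhabitants is not flagged. The change is made on df itself, so nothing is returned.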

Working With Two Dataframes

To access the input dataframes, use dfs[0] and dfs[1]. As there are now two dataframes involved, a return dataframe must be specified to produce an output.

In the following example, we use the Python operator to combine two dataframes and compute a new column, Score_change, based on their respective Score columns.
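One way to write such a snippet is sketched below. The Name key column, the sample rows, the suffixes, and the wrapper function combine are all assumptions made so the example runs on its own; the exact way the operator expects the dataframe to be returned may differ.

```python
import pandas as pd  # already available as pd inside the operator

# Hypothetical stand-ins for the operator's two input dataframes
dfs = [
    pd.DataFrame({"Name": ["Ann", "Ben"], "Score": [70, 85]}),
    pd.DataFrame({"Name": ["Ann", "Ben"], "Score": [75, 80]}),
]

def combine(dfs):
    """Join both inputs on Name and compute the change in Score."""
    # Suffixes keep the two Score columns apart after the merge
    merged = dfs[0].merge(dfs[1], on="Name", suffixes=("_before", "_after"))
    merged["Score_change"] = merged["Score_after"] - merged["Score_before"]
    # With two inputs, the output dataframe is returned explicitly
    return merged

result = combine(dfs)
```

Merging on a shared key rather than relying on row order keeps each score paired with the right row even if the two inputs are sorted differently.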