Tableau Public Python

Tableau has an Extract API that lets you create data extracts (.tde files) from any external data source, such as CSV files, SQL databases, or Excel, using C, C++, Java, or Python. This got me thinking.

Note: Tableau Public, the free license version of Tableau, does not support Python integration.

Tableau has released TabPy (Tableau Python Server), an API that enables Python code evaluation within Tableau. Thanks to TabPy, you can create calculated fields using Python code in Tableau 10.2. This is as significant as the R connection that arrived with Tableau 8.1: Python users can now apply advanced analytics and visualize the results in Tableau.

I need to create a live Tableau dashboard with the following requirements: the data sources are Snowflake tables; classifications are created using KNN, random forest, or XGBoost models; and the dashboard updates weekly. I'm thinking about connecting to the Snowflake database via a live connection, then running a Python script for the classification models via TabPy.

Note: The following is a guest post by Tableau enthusiast Curtis Harris.

In 2015, I won my first fantasy-football title in a long-running, highly-competitive league.

Last year’s title was due in large part to a)
import csv
import os

# The previous function created an empty row between each row of data.
# This loop reopens every CSV, removes the empty rows, and writes new CSV files.
# `weeks` is assumed to be defined earlier in the script.
for week in weeks:
    # newline='' lets the csv module handle line endings itself (Python 3).
    with open(f'{week}.csv', newline='') as source, \
            open(f'Week{week}.csv', 'w', newline='') as target:
        writer = csv.writer(target)
        for row in csv.reader(source):
            if row:  # skip the empty rows
                writer.writerow(row)

# Delete the unused CSV files.
for week in weeks:
    os.remove(f'{week}.csv')

Now that we have our data sorted out, we can use Tableau’s Union feature to combine all 17 of our csv files into a single data source. First goal accomplished—data collection is now almost entirely automated and will scale from year to year.

Simplification and scalability were my top concerns with this project. In previous years, I had to bounce back and forth between five to ten different worksheets and dashboards, all of which made sense only to me. For the draft model to scale, the application needed to be simple to use and to cover a wide range of analyses with a minimal number of views.

To design for simplicity means to design for commonality. As data-vizzers, we need to focus our user interface and user experience in a way that the greater population already understands.

With that in mind, I decided to develop this application using a collapsible-menu system that we see in many mobile apps. This method was originally surfaced by Robert Rouse on the InterWorks blog, and was recently extended to a template style by Josh Tapley.

Using the collapsible menu gave me expanded real estate for data visualization, and produced an interface that most are used to. The menu allows fantasy players to set scoring thresholds, avoid specific teams, and keep on top of the draft by crossing off players that have been drafted.

So how did I use these methods to claim fantasy glory? As stated earlier, I believe the key to winning fantasy football is having a team that consistently performs at a high level. The visualizations in this dashboard focus on that theory.

Looking at the core of my offense, one can see that I focused on players with a low coefficient of variation. This is a measure of a player's season-long consistency. It’s also the reason why we don’t need to see trend charts in this dashboard.

By focusing on players with low variation and high points per game, I was able to glide into the playoffs and beat out my higher-ranked opponents who had high-variance players on their rosters.
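As a rough illustration of the consistency metric described above, the coefficient of variation is the standard deviation of a player's weekly points divided by the mean. The sketch below uses the population standard deviation, which is an assumption (the post does not say which variant the dashboard uses), and the weekly scores are made-up numbers for two hypothetical players.

```python
from statistics import mean, pstdev


def coefficient_of_variation(weekly_points):
    """Return standard deviation divided by mean for a list of weekly scores."""
    average = mean(weekly_points)
    if average == 0:
        raise ValueError("mean is zero; coefficient of variation is undefined")
    # Population standard deviation is an assumption for this sketch.
    return pstdev(weekly_points) / average


# Hypothetical weekly scores; both players average 15 points per game.
steady_player = [14, 15, 16, 15, 15]
streaky_player = [2, 30, 5, 28, 10]

print("steady: ", coefficient_of_variation(steady_player))
print("streaky:", coefficient_of_variation(streaky_player))
```

Both hypothetical players have the same points-per-game average, but the steady player's lower coefficient of variation is what a consistency-first ranking would favor.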

This dashboard will also assist in recognizing value early in the draft where others focus on high-cost, high-variance players. By default, the bar chart is configured to rank players based on the consistency metrics discussed in the previous paragraph. Let’s take a look at wide receivers, for example:

Antonio Brown was the top receiver in most scoring formats, and he will likely be a top three draft pick in most leagues. I wouldn’t fault anyone for taking Brown with the top pick, but if you end up near the end of your draft order, consider Brandon Marshall.

Marshall was one of the most consistent wide receivers last year and also carried a high point-per-game average. He is also currently ranked #19 by industry experts, but one could argue that he is the top option at wide receiver.

Let’s take a look at the same chart if we take touchdown receptions out of the equation. Navigate to the Scoring Settings tab and move the Touchdown Receptions slider to 1. The analysis will now recalculate and provide us with a different ranking, one based on volume and yardage instead of favoring touchdown-heavy receivers.

Marshall is now our second-ranked receiver, and we see that Julio Jones has moved to the top. If Jones had converted only a handful more touchdowns in 2015, he would easily have been the top-ranked receiver, and he definitely deserves a top-three draft pick alongside Brown.

Hopefully this dashboard proves to be a valuable tool in your upcoming draft. Remember, slow and steady wins the race. Drafting for consistency will help you have a successful and fun fantasy-football season!

Got a sports viz of your own? Share it in the comments below or tweet it to us using the hashtag #OlympicsViz.

Disclaimer: This topic includes information about a third-party product. Please note that while we make every effort to keep references to third-party content accurate, the information we provide here might change without notice as Python changes. For the most up-to-date information, please consult the Python documentation and support.

Python is a widely used high-level programming language for general-purpose programming. By sending Python commands to an external service through Tableau Prep Builder, you can easily extend your data preparation options by performing actions like adding row numbers, ranking fields, filling down fields and performing other cleaning operations that you might otherwise do using calculated fields.

To include Python scripts in your flow, you need to configure a connection between Tableau and a TabPy server. Then you can use Python scripts to apply supported functions to data from your flow using a pandas dataframe. When you add a script step to your flow and specify the configuration details, file, and function that you want to use, data is securely passed to the TabPy server, the expressions in the script are applied, and the results are returned as a table that you can clean or output as needed.
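To make the data flow concrete, here is a minimal sketch of the kind of function a script step could call: it receives the flow data as a pandas DataFrame and returns a DataFrame with the same fields. The function name fill_down and the sample column names are hypothetical, chosen only to illustrate the "filling down fields" operation mentioned earlier.

```python
import pandas as pd


def fill_down(df):
    """Return the input with each missing value replaced by the value above it."""
    # DataFrame.ffill() forward-fills missing values column by column.
    return df.ffill()


if __name__ == "__main__":
    # Local check with made-up data; in a flow the DataFrame comes from the script step.
    sample = pd.DataFrame({"Region": ["East", None, "West", None],
                           "Sales": [10.0, 12.5, 7.0, 3.0]})
    print(fill_down(sample))
```

Because the function returns the same fields it received, it does not change the shape of the data; it only fills gaps within the existing columns.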

You can run flows that include script steps in Tableau Server as long as you have configured a connection to your TabPy server. Running flows with script steps in Tableau Online isn't currently supported. To configure Tableau Server, see Configure the Tableau Python (TabPy) server for Tableau Server.

Prerequisites

To include Python scripts in your flow, complete the following setup. Creating or running flows with script steps in Tableau Online isn't currently supported.

  1. Download and install Python. Download and install the most current version of Python for Linux, Mac, or Windows.

  2. Download and install the Tableau Python server (TabPy). Follow the installation and configuration instructions for TabPy. Tableau Prep Builder uses TabPy to pass data from your flow to your script as the input, apply the script, and return the results to the flow.

  3. Install pandas. Run pip3 install pandas. You must use a pandas DataFrame in your scripts to integrate with Tableau Prep Builder.

Configure the Tableau Python (TabPy) server for Tableau Server

Use the following instructions to configure a connection between your TabPy server and Tableau Server.

  • Version 2019.3 and later: You can run published flows that include script steps in Tableau Server.
  • Version 2020.4.1 and later: You can create, edit, and run flows that include script steps in Tableau Server.
  • Tableau Online: Creating or running flows with script steps isn't currently supported.

  1. Open the TSM command line/shell.
  2. Enter the following commands to set the host address, port values and connect timeout:

    tsm security maestro-tabpy-ssl enable --connection-type {maestro-tabpy-secure/maestro-tabpy} --tabpy-host <TabPy IP address or host name> --tabpy-port <TabPy port> --tabpy-username <TabPy username> --tabpy-password <TabPy password> --tabpy-connect-timeout-ms <TabPy connect timeout>

    • Select {maestro-tabpy-secure} to enable a secure connection or {maestro-tabpy} to enable an unsecured connection.
    • If you select {maestro-tabpy-secure}, specify the certificate file -cf<certificate file path> in the command line.
    • Specify the --tabpy-connect-timeout-ms <TabPy connect timeout> in milliseconds. For example, --tabpy-connect-timeout-ms 900000.
  3. To disable the TabPy connection, enter the following command:

    tsm security maestro-tabpy-ssl disable

Create your Python script

When you create your script, include a function that takes a pandas DataFrame (pd.DataFrame) as its argument. Tableau Prep Builder passes the data from your flow to this function. The function must also return its results as a pandas DataFrame (pd.DataFrame) using supported data types.

For example, to add encoding to a set of fields in a flow, you could write the following script:
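A minimal sketch of such an encoding script is shown here, with several assumptions: the field names Region and Segment are hypothetical, and the encoding is done with pandas factorize (integer codes in sorted order of the distinct values), which may differ from the technique used in the original listing.

```python
import pandas as pd

# Hypothetical field names; replace them with fields from your own flow.
FIELDS_TO_ENCODE = ["Region", "Segment"]


def encode(df):
    """Return a copy of df with each listed text field replaced by integer codes."""
    out = df.copy()
    for field in FIELDS_TO_ENCODE:
        # factorize returns (codes, uniques); sort=True numbers values in sorted order.
        codes, _uniques = pd.factorize(out[field], sort=True)
        out[field] = codes
    return out
```

The function copies its input before changing it, so the DataFrame passed in is left untouched and only the returned table carries the encoded values.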

The following data types are supported:

Data type in Tableau Prep Builder    Data type in Python
String                               Standard UTF-8 string
Decimal                              Double
Int                                  Integer
Bool                                 Boolean
Date                                 String in ISO_DATE format “YYYY-MM-DD” with optional zone offset. For example, “2011-12-03” is a valid date.
DateTime                             String in ISO_DATE_TIME format “YYYY-MM-DDTHH:mm:ss” with optional zone offset. For example, “2011-12-03T10:15:30+01:00” is a valid date.

Note: Date and DateTime must always be returned as a valid string.
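One way to produce strings in these formats is Python's standard isoformat method. The sketch below is an illustration only, and the helper names are hypothetical.

```python
from datetime import date, datetime, timedelta, timezone


def date_to_prep_string(value):
    """Format a date as YYYY-MM-DD."""
    return value.isoformat()


def datetime_to_prep_string(value):
    """Format a datetime as YYYY-MM-DDTHH:mm:ss, plus the zone offset if it has one."""
    # timespec="seconds" drops any microseconds from the output.
    return value.isoformat(timespec="seconds")


plus_one_hour = timezone(timedelta(hours=1))
print(date_to_prep_string(date(2011, 12, 3)))  # 2011-12-03
print(datetime_to_prep_string(
    datetime(2011, 12, 3, 10, 15, 30, tzinfo=plus_one_hour)))  # 2011-12-03T10:15:30+01:00
```

For a whole pandas column of datetime values, a similar result can usually be obtained with the column's dt.strftime accessor and a matching format string.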

If you want to return different fields than what you input, you'll need to include a get_output_schema function in your script that defines the output and data types. Otherwise, the output will use the fields from the input data, which are taken from the step that is prior to the script step in the flow.

Use the following syntax when specifying the data types for your fields in the get_output_schema:

Function in Python    Resulting data type
prep_string()         String
prep_decimal()        Decimal
prep_int()            Integer
prep_bool()           Boolean
prep_date()           Date
prep_datetime()       DateTime

The following example shows the get_output_schema function added to the field encoding Python script:
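The original listing is not reproduced here, so what follows is only a sketch of the general shape of a get_output_schema function, under stated assumptions. The field names are hypothetical. The prep_* stand-ins are also hypothetical: they exist only so the sketch runs outside Tableau Prep, their return values are placeholders, and they are skipped when functions with those names are already defined.

```python
import pandas as pd

# Assumption: Tableau Prep supplies prep_int(), prep_decimal(), and the other
# prep_* functions when it runs the script. These stand-ins are placeholders
# for running the sketch on its own and do not reflect Tableau's real values.
if "prep_int" not in globals():
    def prep_int():
        return ["integer placeholder"]

if "prep_decimal" not in globals():
    def prep_decimal():
        return ["decimal placeholder"]


def get_output_schema():
    """Describe the fields and data types that the script returns."""
    return pd.DataFrame({
        "Region": prep_int(),      # hypothetical field, returned as Integer
        "Segment": prep_int(),     # hypothetical field, returned as Integer
        "Sales": prep_decimal(),   # hypothetical field, returned as Decimal
    })
```

Each key names an output field and each value is the prep_* function for the data type that field should have when the results come back to the flow.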

Connect to your Tableau Python (TabPy) server

Important: Starting in Tableau Prep Builder version 2020.3.3, you configure your server connection once from the top Help menu. Previously, you set up the connection per flow in the Script step by clicking Connect to Tableau Python (TabPy) Server and entering your connection details. If you open a flow in version 2020.3.3 that was created in an older version of Tableau Prep Builder, you will need to reconfigure its connection using this new menu.

  1. Select Help > Settings and Performance > Manage Analytics Extension Connection.
  2. In the Select an Analytics Extension drop-down list, select Tableau Python (TabPy) Server.

  3. Enter your credentials:
    • Port 9004 is the default port for TabPy.
    • If the server requires credentials, enter a username and password.
    • If the server uses SSL encryption, select the Require SSL check box, then click the Custom configuration file... link to specify a certificate for the connection.

      Note: Tableau Prep Builder doesn't provide a way to test the connection. If there is a problem with the connection, an error message shows.
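Since there is no built-in connection test, a quick check from the same machine can at least confirm that something is listening on the TabPy port. The sketch below is an assumption-laden illustration: the helper name can_reach is hypothetical, and it only verifies that a TCP connection opens. It does not verify that the listener is TabPy or that your credentials or SSL settings are correct.

```python
import socket


def can_reach(host, port=9004, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # "localhost" is a placeholder; use your TabPy server's host name or IP address.
    print("TabPy port reachable:", can_reach("localhost"))
```

The default of 9004 matches the default TabPy port mentioned above; pass a different port if your server is configured otherwise.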

Add a script to your flow

Start your TabPy server, then complete the following steps:

Note: TabPy requires tornado package version 5.1.1 to run. If you receive the error 'tornado.web' has no attribute 'asynchronous' when trying to start TabPy, run pip list from the command line to check which version of tornado is installed. If you have a different version installed, run pip uninstall tornado to uninstall your current version, then run pip install tornado==5.1.1 to install the required version.
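As an alternative to reading through the pip list output, the installed tornado version can be looked up from Python itself. This is a small sketch using the standard importlib.metadata module; the helper name installed_version is hypothetical, and the required version is taken from the note above.

```python
from importlib.metadata import PackageNotFoundError, version

REQUIRED_TORNADO = "5.1.1"  # version stated in the note above


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version("tornado")
if found is None:
    print("tornado is not installed in this environment")
elif found != REQUIRED_TORNADO:
    print(f"tornado {found} is installed; the note above calls for {REQUIRED_TORNADO}")
else:
    print(f"tornado {found} matches the required version")
```

Run it with the same Python interpreter that starts TabPy, since each environment has its own set of installed packages.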

  1. Open Tableau Prep Builder and click the Add connection button.

    In web authoring, from the Home page, click Create > Flow or from the Explore page, click New > Flow. Then click Connect to Data.

  2. From the list of connectors, select the file type or server that hosts your data. If prompted, enter the information needed to sign in and access your data.

  3. Click the plus icon, and select Add Script from the context menu.

  4. In the Script pane, in the Connection type section, select Tableau Python (TabPy) Server.

  5. In the File Name section, click Browse to select your script file.
  6. Enter the Function Name then press Enter to run your script.
