Tableau Desktop Personal = $35 ( with annual maintenance $999) Tableau Desktop Professional = $70 ( with annual maintenance $1,999) Where is your Data Stored? For Tableau Desktop, reports published can be stored on your Local Drive as well as on the Tableau Server whereas, for Tableau Public, reports published can only be saved on the Tableau.
- Tableau Desktop Professional 2018.2
- Tableau Desktop Professional Edition 2020
- Install Tableau Desktop 2020
- Tableau Desktop Professional Edition 2020.1.3
- Download Tableau Desktop 2020
- If the license end date for Tableau Desktop or Tableau Prep Builder has been extended prior to a silent refresh attempt, the updated license term is updated automatically and downloaded into the trusted storage of the computer where Tableau Desktop or Tableau Prep Builder is running.
- Download Free eBook:Tableau Desktop Professional Edition v2020.1.3 - Free epub, mobi, pdf ebooks download, ebook torrents download.
- Academind Pro Tableau Desktop 2020 - A Complete Introduction Other Courses Free Courses online, free download Torrent of Phlearn, Pluralsight, Lynda, CBTNuggets, Laracasts, Coursera, Linkedin, Teamtreehouse etc. FreeCoursesOnline.Me.
The Bookshop data set was created for Tableau Desktop 2020.2 to showcase relationships—the new data modeling capabilities for combining data.
Download the files
You can choose to download the raw data and start from scratch building your own data model, or download one of the pre-built data sources to jump right into analysis with relationships.
- Raw data (xlsx)
- Bookshop.xlsx - the raw data as an .xlsx file
- BookshopLibraries.xlsx - additional library tables that introduce many-to-many relationships (contains only the additional library tables)
- Packaged data source (tdsx) files
- Bookshop.tdsx - a packaged .tdsx file with the related data source already built and some metadata customizations added
- MinimalBookshop.tdsx - the same tables as Bookshop.tdsx but without any metadata or cleanup
- Bookshop_libraries.tdsx - a packaged .tdsx file that adds the library tables to Bookshop.tdsx (includes all tables).
To connect to a downloaded .tdsx
- Open Tableau Desktop.
- In the Connect pane on the left, choose the More... option under the To a File heading.
- Navigate to where you downloaded the .tdsx and double click it (or select it then click Open).
About the data set
Bookshop consists of 13 tables, combined in the following manner:
Note: A central concept for this data set is the idea of a book versus an edition. A book is a concept with attributes such as author, title, and genre. An edition is a physical version of the book, with attributes such as format (hardcover, paperback), publication date, and page count.
Some fields may need a little explanation.
- The Rating field in the Ratings table is on a scale of 1-5 with 5 being high.
- The Format field is a detailed breakdown of the format. Everything other than “Hardcover” could be considered “Paperback”.
- ISBN stands for International Standard Book Number and is a unique 13 digit identifier given to each edition of a book. The ISBN is represented in the barcode and is tied to the price.
- The ItemID and OrderID fields are hierarchical. An order may contain multiple items.
- The Staff Comment field contains summaries and reviews for some of the books.
If you build your own data source
- The Publisher and Sales tables must be related to the Edition table.
- The Book and Info tables can be related or joined, but it must be on the calculation
BookID = [BookID1]+[BookID2].
- An inner join is suggested.
- The Series table can only be used once the Info table is part of the data source.
- We recommend renaming the logical table of Book joined with Info “Book” or something similar.
- The four Sales tables can be analyzed independently, but if they are unioned, we recommend renaming the union “Sales” or something similar.
- Most of the fields used to form relationships aren’t needed for analysis and can be hidden once the data model has been built
- Any field ending with ID can be hidden (the only “ID” fields that do are ReviewID and ReviewerID from Ratings and ItemID and OrderID from Sales).
- ISBN should be kept, ideally from the Edition table, as the identifier of each edition. The ISBN field in the unioned Sales table can be hidden.
- The Title field in the Award table can be hidden.
- The sheet and table fields from the Union also don’t contain unique information and can be hidden.
The BookshopLibraries.xlsx file introduces new tables around libraries for the purpose of supporting many-to-many relationships. The Catalog table relates to the Editions table on ISBN. The Library Profiles table relates to the Catalog table as a many-to-many (n:n) relationship on LibraryID.
- The Library field is the name of the library.
- The Library ID field an ID also used in the Catalog table.
- The Consortium Member field is a yes/no field that indicates if the library is part of the larger network of libraries that offer inter-library loans and share other services
- The Private field is a yes/no field that indicates if this is a members-only library or a public institution
- Staff Type and Number of Staff together describe how many Librarians, Library Assistants, and Library Technicians work for each library.
- LoanID is a unique identifier for the combination of ISBN and library that is used to track the Number of Copies a library has. For example, if the Idle Hour Library has two paperback copies and one hardcover copy of a title, there will be two Loan IDs.
- The Library ID field is an ID also used in the Library Profile table.
The Library Profile table was originally formatted as an unpivoted table with a column for each staff type, below.
|Library||LibraryID||Consortium Member||Private||Librarians||Library Technicians||Library Assistants|
Idle Hour Library
The Bibliophile’s Shelves
|Old Friend Library||L-OFL||Yes||No||3||5||17|
Page Station Book Exchange
The table was pivoted across the Librarians, Library Technicians, and Library Assistants columns. The final format has a column for Staff Type and a column for Number of Staff. However, this means that there is more than one row for each library. The Library ID values are therefore not unique, and the pivoted version of the table must be related many-to-many to the Catalog table.
While this has the benefit of allowing a many-to-one relationship between Catalog and Profile, it isn’t an ideal structure for data analysis(Link opens in a new window).
This data set, while fictional, supports various analytical scenarios and plenty of exploration. Some suggestions include:
- What books are the most popular? The least popular? Is that based on sales, reviews, checkouts, or another metric?
- Who was the youngest debut author? Who was the oldest?
- Do some publishing houses seem to specialize in any way?
- What was the longest time between editions of the same book?
- Are there any seasonal trends for sales? What about checkouts? Do any titles or genres have seasonal fluctuations?
- Are there any correlations between checkouts, print run size, book review ratings, and sales volume?
- Do the authors who spend the most time writing have the most successful books? Do they have the highest page count?
- When are most books published? Are there any anomalies?
- Are there any trends for genre, format, and price?
- What sort of distributions do the ratings have? Do those distributions vary by book? By genre? Do they seem to align with awards?
- How would you calculate the sales price, given that there is sometimes—but not always—a discount given at the time of sale?
- Do sales approximate the Pareto principle?
- Are there any patterns in the discounts?
- Do any tables in particular appear to have dirty data?
- Etaoin Shrdlu is a reference to linotype hot type machines. Palimpsest is a manuscript-making reference.
- Author names were derived from a list of famous American authors with their first and last names shuffled.
- Publication dates today are usually on Tuesdays. This was reflected in the data, which assumes this industry standard holds into the year 2178.
- One book has no reviews, checkouts, or sales.
- The Checkouts data is based actual library data, with BookIDs mapped to titles, so the checkout trends are real-world.
- The Ratings data is based on actual book rating data, with BookIDs mapped to titles, so the distribution of ratings for a given book is real.
- The proportion of sales that are for a single book versus multiple books is based on real-world data from an independent bookstore.
- The Sales data is completely generated, but is based on actual seasonal and weekday trends for a resort town with a tourism-based economy (proportionally by month and day of the week, and for spring break and the winter holidays).
- Sales spike over long weekends and spring break.
- Sales are much higher on weekends and at their lowest on Tuesdays and Thursdays.
- Peak season is the summer between the 4th of July and Labor Day.
- Sales spike immediately after Thanksgiving and in the weeks leading up to Christmas.
- The ISBNs in the data set are fictitious but constructed following ISBN-13 principles:
- EAN—989 (not actually in use)
- Group number—28
- Publisher registrant element—varying length (2-4 digits)
- Title element, varying length (3-5 digits, depending on the registrant element length)
- Check digit, created with the ISBN-13 check digit modular arithmetic calculation
- One of the developers working on this feature suggested a book with no editions, and the title The Deep Grey, so its BookID ends with the last three digits of his work phone number.
- There is an (intentional) error in the Edition table where two editions of BookID PA169 were labeled as Palimpsest Printing, but the ISBN indicates they were actually printed by Etaoin Shrdlu Press.
The person who created this data set had way too much fun making it, and clearly has a deep obsession with books. She hopes you enjoy the data set and it helps you appreciate the power, elegance, and beauty of relationships in Tableau.
The new data modeling capabilities create flexible data sources built around relationships.
Relationships combine data from different tables by looking at what columns (fields) those tables have in common and using that information to bring information from each table together in the analysis.
Unlike joins or unions, relationships form a data source without flattening multiple tables into a single table. Because of this, related data sources know which table each field is from. That means each field keeps its context, or level of detail. Related data sources can therefore handle tables with different granularity without issues of duplication or data loss.
In a related data source, the joins aren’t fixed up front. Instead of merging all the data (and having to work with all the data regardless of what each viz requires), only the relevant data is combined—per sheet and as necessary. As you drag and drop, Tableau evaluates the relationships of the relevant fields and tables. Those relationships are used to write queries with the correct join types, aggregations, and null handling.
You can think about how the data fits together and what questions you want to answer, rather than how to combine the data or compensate for artifacts from the data source.
Relationships don’t replace the previous ways of combining data: joins, unions, and blends. Rather, relationships are the new, flexible way to bring data together from multiple sources. The existing methods aren’t going away, and are still useful in specific scenarios.
Note: For related information on how relationship queries work, see these Tableau blog posts:
This topic walks through building a related data source and using one for analysis. If you would like to follow along, you can download the Bookshop data set(Link opens in a new window). Choose Bookshop.xlsx for the raw data to start from scratch, or MinimalBookshop.tdsx to start with the related data source basics configured for you.
Tableau Desktop Professional 2018.2
Note: Relationships are available in Tableau 2020.2 and later.
Video: Migrated Data
If you open an existing workbook from a previous version of Tableau (version 2020.1 or earlier) in Tableau 2020.2 or later, your data source tab might show a Migrated Data table.
There are multiple videos embedded in this topic. For your convenience, a transcript is provided in the expandable section underneath each video.Migrated Data video transcript
If you open a workbook created before relationships were available (before version 2020.2), your data source may look different. If there were joined or unioned tables, they appear as a single logical table named Migrated Data. Double click to see the underlying tables—they’re still there, and your analysis will still behave as expected.
Note that when you’re only using the Migrated Data table, the Data pane behaves more like previous versions of Tableau Desktop.
You can even relate new tables to the directly to the Migrated Data table to extend the data model.
Once you’ve added relationships to the data model, the Data pane updates to the new layout, with dimensions and measures maintained within each table, and the Migrated Data table is displayed as a single, merged table.
Video: Relate tablesRelate tables video transcript
To build a data source of related tables, connect to your data and drag a table to the canvas. Drag out a second table. A relationship will automatically form if it can. If Tableau cannot detect the related fields, you will be prompted to select them yourself.
The canvas you’re seeing is a new layer of the data model where you can relate tables together. Dragging out additional tables adds them to the data model. The noodle shows which tables are related.
Relationships can be based on calculated fields. You can also specify how fields should be compared by using operators when you define the relationship.
You can create a join by double clicking on a table to open the join canvas (also referred to as the physical layer), and then dragging out a table to create a join. Creating joins is the same experience as previous versions of Tableau—you can set a join clause, including a calculation if necessary, and configure the join type.
Once you close the join canvas, the table on the relationship canvas (also referred to as the logical layer) has a join icon to indicate the underlying join. This is because a join merges its tables together to form a single, new table. Related tables remain distinct.
Add unions by dragging out New Union. Unions, like joins, actually merge the data, and show an icon on the logical layer. We recommend always renaming unions—or any tables with unclear names—to something more useful.
Video: Performance OptionsPerformance Options video transcript
For any relationship, there are performance options as well as related fields. Tableau automatically chooses safe defaults for these options. However, if you are sure of the characteristics of your data, changing these options can give Tableau more information about how to best configure automatic joins during analysis. Setting the cardinality and referential integrity options gives Tableau more information about how to optimize queries.
Cardinality indicates whether or not the linking field records are unique. If multiple books can have the same author, but each author is listed only once in the author table, the cardinality for AuthID would be Many for Books and One for Author between those two tables. This setting impacts how Tableau handles aggregation before or after the join.
Referential integrity indicates whether or not a record in one table is guaranteed to have a relationship in another table. If every book has an author, Book has referential integrity to Author. If not every author has a book, Author does not have referential integrity to Book. This setting impacts how Tableau chooses a join type and handles unmatched records.
For more information, see Cardinality and Referential Integrity.
Example: Bookshop Data
We’ll do some analysis with this data source in a moment, so let’s talk about the data.
The data set is around (fictitious) books, and it’s important to consider the distinction between a book and an edition. A book is the conceptual work itself, with attributes such as the title, author, and genre. There are also editions of a book, with attributes such as a price and format (hardcover or paperback), that are identified with an ISBN. An edition of a book has a publisher and a page count, whereas a book may have won an award or be in a series.
You can follow along by downloading MinimalBookshop.tdsx, or be adventurous and build the model yourself from Bookshop.xlsx. Note that you can hide many of the ID fields once the tables are combined.
Video: Work with multiple tablesWork with multiple tables video transcript
The Data Source page
Before we go to a sheet tab, notice that the data grid view, below, shows only the fields from one table at a time. If we click to another table, we’ll see the data it contains. There is no overall view because related data is not brought together until it’s used in an analysis. For now, the data remains in each table and is therefore displayed by table.
If we click on a table with an underlying join or union, we do see that merged data, with any nulls that may involve.
The Data pane
When we click onto a sheet, you may notice that the Data pane looks different than in previous versions of Tableau. Instead of the primary organization of dimensions and measures, it’s now focused on tables. Each table has its own breakdown for dimensions and measures, indicated by this line. It’s possible to still group by folders instead, if you choose.
Number of Records and Count
There is also no Number of Records field. This is because the concept of Number of Records has changed. There isn’t a single Number of Records for the data source. Each table has a Count field, which can be thought of as a local Number of Records for that table.
To get a sense of the number of records for multiple tables at the same time, you can use Measure Names and Measure Values or multi-select the count fields and use Show Me to create a viz.
Video: Basic calculationsBasic calculations video transcript
There are a few calculations we can do to make this data set a bit friendlier. First, we can create a calculation for the full author name that brings together first name and last name. We’ll create a calculation called Author Name that is simply first name, a space, and last name. Note that this calculation stays in the Author table because it uses fields only from that table.
When we make a sale, the sales amount is a combination of the price—fixed by the publisher and specific to the edition, or the ISBN—and any discount at the time of the sale. So the calculation Sales Amount will be price times one minus the discount. Note that we’re using the ZN—or Zero Null—function to handle sales without discounts so we don’t get nulls. This calculation goes to the shared area at the bottom of the Data pane as it uses fields from multiple tables. Aggregated calculations will also go to the bottom of the Data pane.
Video: Sets and groupsSets and groups video transcript
We’ll create a set for books to identify if they’re in a series. In the Book table, right click on Title and select Create > Set. We’ll name it In a series. On the Condition tab, choose By field with Series Name, Count, as >= 1.
In the Edition table, the Format field is quite detailed, including specifying mass market and trade paperback. Sometimes that detail is useful, but sometimes we want to generalize between hardcover and paperback. Right click on Format field in the Data pane and select Create > Group. We’ll name it Cover. Control-click to select everything except Hardcover and then click Group. The name of the group is automatically highlighted, and we can call it Paperback.
Like with the Author Name calculation, these items we just created are relevant within a single table and therefore stay within those tables in the Data pane.
Analyze related data
Let’s do some comparisons of how analysis looks between relationships and joins. We recommend that you download the companion workbooks for Question 1 and Question 2 rather than continue with your own data source from above. If you identify issues with the data source configuration or steps for Question 1, note that they are intentional for the purposes of the scenario.
Question 1: How many editions are there for each book?
Important: The video for Question 1 was recorded on beta software. The released version of 2020.2 correctly handles our attempt to introduce an error in the related data source. If you follow along, your workbook will not match the video—the relationships values will be correct. Please bear with us while we attempt to outsmart our product and force an error for educational purposes.
Tableau Desktop Professional Edition 2020
We’ll set this up by doing some basic analysis with both data sources. We’ll build a bar chart with each one, looking at the number of editions by title.
Using the related data source:
Using the joined data source:
It’s immediately clear that Count is not the right aggregation for the joined data. However, something isn’t right with the related data, either. Let’s view the data and see what’s going on and how to fix it. Watch the video (or read its transcript) to see how.
Video: Data validation and fixing the data modelQuestion 1 video transcript
Install Tableau Desktop 2020
Here we’re looking at our joined data. On the joined viz, we’ll take a look at a mark’s underlying data.
This view data dialog has two tabs, Summary and Full Data.
- There are tons of rows for the same book—this is likely duplicated because of all the joins.
To handle the duplication, we’ll change the aggregation on the ISBN field in the view to Count Distinct. Remember that we’ll need to think about this duplication in almost every analytic scenario and address it with the correct aggregation, filtering, or how we structure the viz.
Let’s take a look at the structure of our related data source, and we’ll take a peek at what’s going on with our join as well. On the related viz, we’ll take a look at a mark’s underlying data.
And we see that the view data dialog again has two tabs, but this time it’s the Edition tab instead of Full Data. Every table used in the viz will have its own tab.
- There’s a doubling of rows for this book—and we know that joins can cause duplication, and that the Book table in this data source is really the Book table joined with Info and Award.
We can create a viz with Title and Award Name and scroll down to investigate. And sure enough, The Mallemaroking has won two awards. It would be better to relate these tables than to join them.
Tableau Desktop Professional Edition 2020.1.3
We’ll go back to the Data Source tab and remove the Awards table from the join and relate it instead. And now we don’t have to worry about the duplication issue any more for this data source.
Question 2: For authors with books in a series, who is has the most book tour events?
Video: Work with different levels of detailsQuestion 2 video transcript
Let’s look at which authors we’ll be working with. Using the related data source, we’ll bring Author Name and Series Name to Rows.
Because related data sources pare down unmatched data in tables that only contain dimensions, we’re focusing on the data that is relevant to us. We can see that there are seven authors, two of whom have written for two series. If you don’t want to see only matched values, you can restore those unmatched values (which is more like the legacy behavior) by going to the Analysis menu > Table Layout and checking Show Empty Rows.
Using the related data source:
Using the joined data source:
Our numbers here look a bit off.
For the joined data, we know there’s duplication because of the join, and we also know some authors span series. Because of this, we can’t just change the aggregation to, say, MIN or MAX, because we lose information for authors with works in multiple series.
What we really want is the number of events per series, visualized by author. This is a classic case for Level of Detail (LOD) expressions. We’ll create a calculation Series Events:
Note that the MIN is to handle duplication of events for a single series.
Now if we bring this new field to Columns instead of the original events field, we’ll get the correct values.
For the related data, we didn’t need to do any of that. Relationships are smart enough to understand the native level of detail and the way the Author table relates through the Book and Info tables through to the Series table, and to correctly join and aggregate the Events measure back to Author Name—all without having to write LOD calculations.
So don’t be scared of relating your data. See for yourself! You can download the Bookshop data(Link opens in a new window) or use some of your own. Try filtering, using table calculations, building a variety of chart types, configuring performance options, and pushing relationships as far as you can.
Tips when working with related data
- View Underlying Data often to verify what data a mark represents.
- If you aren’t required to join—and there are reasons you may need to—using a relationship provides more flexibility.
- If you don’t want to see only matched values, you can restore those unmatched values by going to the Analysis menu > Table Layout > Show Empty Rows.
Ready to tackle calculations with Relationships? Check out Don’t Be Scared of Calculations in Relationships.
Ready to keep exploring how to do complex analysis with Relationships? Check out Don’t Be Scared of Deeper Relationships.
For more information on the technical underpinnings of relationships straight from the Product Management team, check out the series on relationships on the Tableau Blog.