Tableau Software Partners with DataSift, Google BigQuery

2 Jul 2013

[DataSift, Tableau and Google BigQuery logos]

Tableau, DataSift and Google BigQuery are intent on taking social media analytics to the next level. The three recently forged a partnership that lets users move data seamlessly between the platforms, opening up near-endless possibilities for social media data analysis. Users will be able to delve deeper into data from Facebook, Twitter, Google+ and a slew of other social networks, faster than ever.

The collaboration got its first test at LeWeb 2013 in June, where the three companies produced a Social Media Dashboard that surfaced insights from live social media data in real time.

Here’s how it works:

DataSift Collects the Data

DataSift connects users to conversations and interactions occurring on their social media channels as they happen. The platform captures all the data surrounding these social activities, with the option to combine and organize that data however you see fit. You can even mix in lists and other data sources to gain deeper levels of insight.
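As a rough illustration, here is a minimal sketch of defining and compiling a DataSift filter. Filters are written in DataSift's CSDL language; the endpoint path, auth header format and credentials below are illustrative assumptions, not a verbatim API reference.

```python
# A minimal sketch, not a verbatim DataSift API reference: the endpoint
# path and auth header format are assumptions for illustration only.
import requests

# CSDL: DataSift's filtering language for live social interactions.
CSDL = '''
interaction.type in "twitter,facebook" AND
interaction.content contains_any "tableau,bigquery,datasift"
'''

resp = requests.post(
    "https://api.datasift.com/v1/compile",                    # assumed endpoint
    data={"csdl": CSDL},
    headers={"Authorization": "your-username:your-api-key"},  # assumed format
)
stream_hash = resp.json()["hash"]
print("Subscribe to live interactions using stream hash:", stream_hash)
```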

Google BigQuery Stores the Data

Once DataSift has collected that much social data, it needs a place to store it. That's where Google BigQuery comes in. BigQuery's web-based, flexible infrastructure makes it an excellent platform for storing and quickly accessing real-time data. Users can easily scale the interactive service to accommodate huge datasets.
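For a sense of what the storage step looks like, here is a minimal sketch of loading collected interactions into BigQuery with the google-cloud-bigquery Python client; the project, dataset, table and file names are hypothetical.

```python
# A minimal sketch of loading collected interactions into BigQuery.
# Project, dataset and table names here are hypothetical.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # hypothetical project id
table_id = "my-project.social.interactions"

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
    autodetect=True,  # infer the schema from the JSON records
)

# Load a newline-delimited JSON file of interactions into the table.
with open("interactions.json", "rb") as f:
    job = client.load_table_from_file(f, table_id, job_config=job_config)
job.result()  # wait for the load to finish
print(f"Loaded {client.get_table(table_id).num_rows} rows")
```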

Tableau Visualizes the Data

After the data is collected and stored, Tableau does what it does best – visualization. Using a native Google BigQuery connector, Tableau can easily create a live connection to social media data stored within BigQuery. After establishing that connection in mere seconds, Tableau gives users the ability to instantly visualize and analyze their data to the full extent of Tableau’s capabilities.
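Tableau handles the connection through its UI, but for illustration, the query below is roughly the kind of aggregation a live Tableau worksheet pushes down to BigQuery; the table and column names are hypothetical.

```python
# Illustrative only: roughly the kind of aggregate query a live Tableau
# connection issues against BigQuery (table and columns are hypothetical).
from google.cloud import bigquery

client = bigquery.Client()
sql = """
    SELECT network, DATE(created_at) AS day, COUNT(*) AS interactions
    FROM `my-project.social.interactions`
    GROUP BY network, day
    ORDER BY day
"""
for row in client.query(sql).result():
    print(row.network, row.day, row.interactions)
```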

That’s it! 

The partnership between Tableau, DataSift and Google BigQuery checks off a massive item on the wish lists of analysts and marketers alike: real-time social media analysis. Users can finally make decisions and gain insight at the speed of social media. It's easy to see the astronomical value this presents to any business engaged in social network activity.

Ready to Get Started?

If you’re ready to start capitalizing on your social media data, get in touch with us. We’ll get you set up.

Comments

July 2, 2013

Ben Sullins

Google BigQuery's Flawed Pricing Model

Great article, Eric. I think there is an issue here with how Google BigQuery does pricing, however. I know some friends who tried it out and loved the ease of use and performance, but found it cost-prohibitive when dragging a single measure onto a view can cost hundreds or thousands of dollars. I don't believe anyone should be using BigQuery with Tableau yet, at least not until this pricing issue is sorted out for "chatty" query tools like Tableau.

July 2, 2013

eshiarla

Ben, thanks for the feedback.

I agree, BigQuery's pricing model is structured in a way that demands analysts and developers know roughly how much data they are processing. I see a lot of cloud-based services heading in a similar direction; Amazon Redshift is another good example, although it bills by the hour. BigQuery is still very much in its infancy, so we may see some shifts in its pricing model. In the meantime, though, there are a couple of things we can do to help mitigate costs.

Pricing wasn't addressed directly in this post, but anyone who is curious can find a breakdown on Google's developer site:
https://developers.google.com/bigquery/pricing
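Since BigQuery bills by bytes processed, one way to know a query's cost before paying for it is a dry run, which reports the bytes that would be scanned without executing. A minimal sketch with the google-cloud-bigquery client; the table name is hypothetical and the $5/TB rate is an assumption to check against current pricing.

```python
# A sketch of estimating query cost before running it via a dry run.
# The table name is hypothetical; the $5/TB rate is an assumption.
from google.cloud import bigquery

client = bigquery.Client()
job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)

job = client.query(
    "SELECT content FROM `my-project.social.interactions`",
    job_config=job_config,
)
tb = job.total_bytes_processed / 1e12
print(f"Would scan {job.total_bytes_processed:,} bytes (~${tb * 5:.2f} at an assumed $5/TB)")
```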

In terms of data-sizing and cost-estimation best practices, I believe it's best to look into tools like the jobs.list API to monitor which queries you're running and the amount of data being processed. It's also worth noting that you're only charged for the columns processed in a query, so you'll want to stay away from custom query connections, which will process data columns even if they aren't being used in a particular view.
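A sketch of that kind of monitoring, using the client-library wrapper around the jobs.list API (method names follow today's google-cloud-bigquery client):

```python
# A sketch of auditing recent queries and bytes scanned, via the
# client-library wrapper around the jobs.list API mentioned above.
from google.cloud import bigquery

client = bigquery.Client()
for job in client.list_jobs(max_results=20):
    if job.job_type == "query":
        print(job.job_id, job.state, job.total_bytes_processed, "bytes processed")
```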

July 3, 2013

Felipe Hoffa

query with care

Great article, Eric, and fair comment, Ben: BQ costs can be high for the data explorer who isn't aware of the power deployed behind each query.

For example, let's say that to query 1 GB of data, 5 powerful computers are "turned on" for 2 seconds, exclusively for this query. The cost of doing that is not much.

But then, what about the data explorer who queries 1 TB of data? Same query, almost the same 2 seconds, with the same magic 'get results in a few seconds' experience. But in this case, 5,000 powerful computers were dedicated exclusively to this query.

So yeah, costs can go up quickly if the user is not aware of the massive tasks being deployed.

There are strategies to improve that, like partitioning tables: if you're only looking for a month of data, don't query the year table; query a table that only has data for that month. Same results, 1/12 of the cost. Or sample the data: use the hash() function to materialize a table with 1/10th or 1/100th of the original data. Costs can go down by a factor of 100, and results should still be representative. You can always validate afterwards, with one query, that the overall and sampled results are in line.
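A sketch of that sampling trick: hash() was a legacy BigQuery SQL function (standard SQL today would use FARM_FINGERPRINT), so this sketch assumes legacy SQL and a table with an id field; all names are hypothetical.

```python
# A sketch of the sampling trick above: materialize ~1% of a table
# using legacy-SQL HASH(). Table and column names are hypothetical.
from google.cloud import bigquery

client = bigquery.Client()
dest = bigquery.TableReference.from_string("my-project.social.interactions_sample")
job_config = bigquery.QueryJobConfig(
    destination=dest,
    use_legacy_sql=True,       # HASH() is a legacy BigQuery SQL function
    allow_large_results=True,  # required for large legacy-SQL destination tables
)

sql = """
    SELECT *
    FROM [my-project:social.interactions]
    WHERE ABS(HASH(id)) % 100 = 0   -- keep roughly 1 row in 100
"""
client.query(sql, job_config=job_config).result()
```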

Also, for Tableau and other chatty clients, costs went down in a big way last month, as BigQuery added caching: identical queries by the same user, on the same day, over the same data are now cached by BQ, without charging for them again.
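A sketch of observing that cache from today's google-cloud-bigquery client: re-run an identical query, then check whether BigQuery served it from cache (table name is hypothetical).

```python
# A sketch of verifying the caching behavior described above.
from google.cloud import bigquery

client = bigquery.Client()
sql = "SELECT COUNT(*) FROM `my-project.social.interactions`"  # hypothetical

first = client.query(sql)
first.result()
second = client.query(sql)  # identical query, same user, same data
second.result()
print("served from cache:", second.cache_hit)  # True when no bytes were billed
```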
