GLOBAL RESEARCH SYNDICATE

Catching up with Google BigQuery

by globalresearchsyndicate
March 2, 2020
in Data Analysis

With the ink just drying on Google’s recently closed acquisition of Looker, all eyes are turning to BigQuery and to Google’s plans for expanding the platform’s footprint. Feeding the anticipation is the fact that GCP’s cloud data warehousing rivals, including Microsoft, Oracle, and SAP, have recently expanded the scope of their offerings to include either back-end data integration or front-end self-service BI visualization. While Google affirms that Looker will retain its multi-cloud platform support, within GCP, BigQuery appears to be the logical target for enhanced integration.

We recently sat in on an update call that reviewed newly introduced features, ranging from the general availability of Redshift and S3 migration tooling and the in-memory BI Engine to beta releases of Flex Slots and column-level security. BigQuery has been one of GCP’s fastest growing services, with the customer base having grown significantly over the past 18 months and, more importantly, with the number of large flat-rate (as opposed to a la carte per-query) customers doubling over the past year. In a just-published blog post, Google pointed to large wins with customers such as KeyBank, Wayfair, Lowe’s, Sabre, and Lufthansa.

BigQuery is unique in that, unlike most cloud data warehousing services, it is serverless. Traditionally, you used it on an ad hoc basis and didn’t worry about provisioning nodes, although slot pricing was later introduced to make BigQuery costs more predictable for large-scale users. Serverless is also useful for handling high-concurrency scenarios, with Google claiming that some BigQuery users have run up to 10,000 queries at once.

A typical scenario for BigQuery adoption is leveraging the platform’s scale, both in terms of data volumes (with petabyte-size queries not unusual) and high concurrency. While there’s no equivalent of the CAP Theorem when it comes to scale vs. concurrency in analytic databases, most data warehousing platforms usually force a choice between one or the other.

Originally the outgrowth of Google’s log processing system, BigQuery is built on the Dremel query engine, which was also the inspiration for Apache Drill. BigQuery can store a variety of data going beyond typical relational structured data to formats such as Parquet, JSON, or CSV and can use cloud object storage as a source; while such extensibility is not unusual today among cloud data warehousing platforms, BigQuery was one of the first to offer it.

BigQuery originally did not resemble a typical data warehouse, as it worked best when data was organized in nested structures that, at first blush, look more like JSON documents than typical SQL relational or star schemas. Since then, Google claims that BigQuery has evolved so it can now work efficiently with more traditional data warehouse schemas.
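To make the contrast concrete, here is a minimal sketch in plain Python (not BigQuery itself) of the same order expressed as a nested, JSON-like record and as the flat rows a traditional relational schema would hold. The order data and the flatten_order helper are hypothetical and exist only for illustration.

```python
# Hypothetical order data, for illustration only.
# Nested form: one record carries the customer and a repeated list of items.
nested_order = {
    "order_id": 1001,
    "customer": {"id": 7, "region": "EMEA"},
    "items": [
        {"sku": "A-1", "qty": 2},
        {"sku": "B-9", "qty": 1},
    ],
}


def flatten_order(order):
    """Turn one nested order into one flat row per line item."""
    return [
        {
            "order_id": order["order_id"],
            "customer_id": order["customer"]["id"],
            "region": order["customer"]["region"],
            "sku": item["sku"],
            "qty": item["qty"],
        }
        for item in order["items"]
    ]


# Flat form: order and customer fields repeat on every line-item row.
flat_rows = flatten_order(nested_order)
for row in flat_rows:
    print(row)
```

The nested record keeps each order in one place, while the flat form repeats the order and customer fields on every row, which is the shape most existing warehouse queries expect.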

So, customers are likely to need some help when moving data to BigQuery given its unique layout. Partners such as Datometry and CompilerWorks have developed migration tools for moving workloads without having to rewrite queries. Informatica has developed a no-code/low-code BigQuery integration tool that includes a six-step wizard to guide less technical business users through the process. In turn, global SIs such as Accenture, Infosys, and Wipro have developed migration tooling as part of their own BigQuery practices. Google also recently expanded its partnership with SADA Systems, a global consulting and managed services provider specializing in cloud that was one of GCP’s original partners. The two have re-upped with a $500 million agreement that will include support for migrations from Netezza, Teradata, and Hadoop to BigQuery.

When it comes to tooling, Google subscribes to a coopetition model: it works with partners, yet over the past year it has also made several acquisitions. At last year’s NEXT, Google announced Cloud Data Fusion, the result of its acquisition of the company behind the open source CDAP technology, which runs data transformation pipelines inside Google Cloud Dataproc, GCP’s Hadoop service. Subsequently, Google acquired Alooma, which instead uses a staging server approach akin to the AWS and Azure database migration services. While these offerings help round out GCP’s portfolio, we don’t expect Google, as the upstart in the cloud platform ecosystem, to aggressively sell these services in competition with its partners.

One of the key selling points for cloud data platform providers is tapping the synergies across their portfolios. BigQuery’s federated query story is expected to go GA soon. Today it can reach into Cloud SQL (GCP’s MySQL and PostgreSQL services) and Bigtable (the NoSQL database that was the inspiration for Hadoop’s HBase). We believe that down the road, GCP will add Spanner to that list.
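For the Cloud SQL case, a federated query is expressed through BigQuery’s EXTERNAL_QUERY function, which takes a connection ID and a SQL string to run on the remote database. The sketch below only builds such a statement as text; the federated_query helper, the connection ID, and the remote table are hypothetical, and actually running the statement would require a BigQuery client and a configured connection.

```python
def federated_query(connection_id: str, remote_sql: str) -> str:
    """Wrap a remote SQL statement in a BigQuery EXTERNAL_QUERY call.

    Hypothetical helper: it only builds the statement text and does not
    contact BigQuery or Cloud SQL.
    """
    if '"""' in remote_sql:
        # The remote statement is embedded in a triple-quoted SQL string.
        raise ValueError("remote_sql must not contain triple double quotes")
    return (
        f'SELECT * FROM EXTERNAL_QUERY("{connection_id}", '
        f'"""{remote_sql}""")'
    )


# Assumed connection ID and remote table, for illustration only.
example_sql = federated_query(
    "my-project.us.orders_conn",
    "SELECT customer_id, total FROM orders",
)
print(example_sql)
```

The remote statement runs in Cloud SQL’s own dialect (MySQL or PostgreSQL), and the result comes back as a table that the surrounding BigQuery query can join or filter like any other.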

BigQuery has also gotten its feet wet with machine learning (ML) by making it more accessible to SQL developers. Typically, these capabilities enable developers to train and run ML models without having to write Python or R code. BigQuery now supports various model types: linear regression (for predicting numerical values); K-means clustering (for customer segmentation); matrix factorization (in alpha, for recommender systems); XGBoost (for regression, classification, and ranking); deep neural networks (using TensorFlow); and others.
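As a rough illustration of what this looks like to a SQL developer, the sketch below assembles a BigQuery ML CREATE MODEL statement for linear regression and a matching ML.PREDICT query. The two Python helpers and all dataset, table, and column names are hypothetical; the code only produces the SQL text and would need a BigQuery client to execute it.

```python
def create_model_sql(model: str, model_type: str, label: str, source_table: str) -> str:
    """Build a BigQuery ML CREATE MODEL statement (text only)."""
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS (model_type = '{model_type}', input_label_cols = ['{label}']) AS\n"
        f"SELECT * FROM `{source_table}`"
    )


def predict_sql(model: str, input_table: str) -> str:
    """Build an ML.PREDICT query that scores rows with a trained model."""
    return f"SELECT * FROM ML.PREDICT(MODEL `{model}`, TABLE `{input_table}`)"


# Assumed dataset, table, and column names, for illustration only.
create_sql = create_model_sql(
    "mydataset.sales_model", "linear_reg", "revenue", "mydataset.sales_history"
)
score_sql = predict_sql("mydataset.sales_model", "mydataset.sales_new")

print(create_sql)
print(score_sql)
```

Both training and scoring are ordinary SQL statements, so the data never has to leave the warehouse to reach a separate ML environment.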

Google is hardly alone here: the ability to trigger ML models from SQL code so they can run inside the database without having to move data is starting to become a checkbox item. But most of the others (e.g., Amazon Redshift, Oracle, and SQL Server on-premises) typically treat R or Python programs used for ML as user-defined functions, whereas BigQuery stores the models within the data sets themselves. Also, BigQuery’s serverless architecture has made the platform better suited for training models compared to most cloud data warehousing services.

So, what’s next? The obvious question is how Google will blend Looker’s data integration and visualization capabilities into its broader data platform offering. With Microsoft recently unveiling Synapse, which places Azure Data Factory under a common service, Oracle extending the autonomous data warehouse to incorporate self-service data integration tools, and SAP expanding its HANA Data Warehouse to use the analytics of SAP Analytics Cloud, Looker and BigQuery look increasingly like they’re made for each other. But we’re also interested in seeing whether Google will designate BigQuery as one of the services that could get supported under its Anthos hybrid platform. We expect we’ll get some of those answers in April at Google NEXT.

Copyright © 2024 Globalresearchsyndicate.com
