In Data Analytics Breakthrough, Cambridge Semantics Establishes Key Milestone for Loading and Querying Big Diverse

Alok Prasad

BOSTON–Cambridge Semantics, a provider of graph-based Smart Data management and exploratory analytics solutions, announced that its Anzo Graph Query Engine™ completed a load and query of one trillion triples on the Google Cloud Platform, where the company is a Google Cloud Partner, in just under two hours. That is 100 times faster than the previous solution running the Lehigh University Benchmark (LUBM) at the same data scale, the company said in a statement.

“The LUBM results demonstrate that our Anzo Graph Query Engine can handle diverse data at big data scale while maintaining security, provenance and governance,” said Alok Prasad, president of Boston-based Cambridge Semantics. “The largest enterprises can now act with speed and agility in their data integration and analytics no matter the data volume, offering a clear competitive advantage. Many of our customers are already solving problems they couldn’t address before by exploiting our ability to offer end users a way to automatically query relationship-rich, diverse data in new and unexpected ways.”

The LUBM is an industry-standard benchmark that evaluates the query performance of semantic web repositories over a large data set. A ‘triple’ consists of a subject, a predicate and an object. In the LUBM test conducted on the Google Cloud Platform on Oct. 31, 2016, Cambridge Semantics’ Anzo Graph Query Engine loaded and queried 1.065 trillion triples in 1.98 hours, surpassing the previous LUBM record of 220 hours set by Oracle in September 2014.
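To make the figures above concrete, the short Python sketch below models a triple and works out the ratio between the two reported run times. The example triple values (`ex:Student1` and so on) are hypothetical and do not come from the benchmark data. The per-second figure is an aggregate over the whole 1.98-hour load-and-query run, derived here for illustration, and is not a rate reported by the company.

```python
from collections import namedtuple

# A triple is one (subject, predicate, object) statement.
# The values below are invented for illustration only.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])
example = Triple("ex:Student1", "ex:takesCourse", "ex:Course42")

# Figures reported in the article.
TRIPLES = 1.065e12       # triples loaded and queried
ANZO_HOURS = 1.98        # Anzo Graph Query Engine run time
PREVIOUS_HOURS = 220     # previous LUBM result at the same scale

# Ratio of the two run times.
speedup = PREVIOUS_HOURS / ANZO_HOURS

# Aggregate triples handled per second across the whole run
# (load and query combined, so not a pure load rate).
triples_per_second = TRIPLES / (ANZO_HOURS * 3600)

print(f"speedup: about {speedup:.0f}x")
print(f"aggregate rate: about {triples_per_second / 1e6:.0f} million triples/s")
```

The computed ratio comes out slightly above the "100 times faster" figure quoted in the company statement, which reads as a rounded-down claim.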

To place the results in context, examples of one trillion triples include:

  • six months of worldwide Google searches
  • 133 facts for each of the 7 billion people on earth
  • 100 million facts describing all the details of each of 10,000 clinical trial studies
  • 156 facts about each device connected to the internet
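The comparisons above are rough equivalences, and the arithmetic behind them can be checked in a few lines of Python. This is only a sanity check of the stated numbers; the device count is not given in the article and is derived here from the 156-facts figure as an assumption.

```python
ONE_TRILLION = 10**12

# 133 facts for each of 7 billion people.
people_facts = 133 * 7_000_000_000

# 100 million facts for each of 10,000 clinical trial studies.
trial_facts = 100_000_000 * 10_000

# Number of connected devices implied by 156 facts per device
# (derived value, not stated in the article).
implied_devices = ONE_TRILLION / 156

print(f"people:  {people_facts:,} facts")
print(f"trials:  {trial_facts:,} facts")
print(f"devices: about {implied_devices / 1e9:.1f} billion implied")
```

The clinical trial example multiplies out to exactly one trillion, while the population example lands a little below it, so the list is best read as order-of-magnitude illustrations.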

“A key challenge for semantic-based analytics has been enabling a load and query performance on very large data sets from a data lake in timeframes that offer an acceptable user experience,” said Barry Zane, vice president of engineering at Cambridge Semantics. “With the LUBM results, it’s been validated that a loading and query process that once took over a month’s worth of business hours can now be completed in less than two hours.”

“As modern data diversity and volumes grow, relational database management systems (RDBMS) are proving too inflexible, expensive and time-consuming for enterprises,” Zane said. “This benchmark record set by our Anzo Graph Query Engine signals a paradigm shift where graph-based online analytical processing (GOLAP) will find a central place in everyday business by taking on data analytics challenges of all shapes and sizes, rapidly accelerating time-to-value in data discovery and analytics.”

Cambridge Semantics’ Anzo Graph Query Engine is a clustered, in-memory graph analytics engine based on open semantic standards that enables users to develop ad hoc and interactive queries and analytics across very large, richly interconnected data sets. The platform can be deployed behind the enterprise firewall on dedicated enterprise servers or, as in the case of this LUBM run, provisioned automatically on cloud infrastructures such as the Google Cloud Platform.

Cambridge Semantics is part of the Google for Work Partner Program as a Google Cloud Platform Technology Partner, which allows the company to extend its smart data solutions to organizations that want to use Google Cloud Platform as a flexible, large-scale data platform.

